2M maroc

doglover

I am trying to write a grabber for the schedule of 2M Maroc.
This is where I got stuck:

* WebGrab+Plus ini for grabbing EPG data from TvGuide websites
* Site :2m.ma
* revision : 0
* Willy 7/2013
*
site {url=2m.ma|timezone=UTC+00:00|maxdays=7|cultureinfo=fr|charset=UTF-8|titlematchfactor=90}
url_index {url (debug)|http://www.2m.ma/layout/set/ajax/guidetv/guide2mTranche/|subpage|/|urldate|/NATIONAL}
urldate.format {datenumber|unix|0}
subpage.format{list|matinee|aprem|soiree}

index_showsplit.scrub {multi (debug ) |<li class=|||</li}
*index_showsplit.modify {cleanup(removeduplicates)}
index_start.scrub {single (separator="-" include=first)|<span>||</span}
index_stop.scrub {single (separator="-" include=last)|<span>||</span}
index_title.scrub {single|<h5>||</h5>}

index_start.modify {replace|h|:}
index_stop.modify {replace|h|:}

The problem is the first day grabbed. It lists the shows, but the expired shows end up at the end of the list.
This wreaks havoc with the daybreak determination. The solution should be either to sort the shows or simply to delete the expired ones (they have expired, so losing them does not matter).
I could not get either of these solutions to work.
Help is appreciated.
 
Willy

WGMaker

Hi Willy,
 
The problem with 2m.ma is that the shows in the index pages are not in ascending time order. To solve that we have to use the sort command (see 4.6.4.9 of the manual), which needs a 'sort_by' element derived from the start time. The sort must be done day by day, because the start times run from 0 to 24 every day; if you sort several days at once, their shows all get mixed up. We must also increase the sort_by value of the early-hours shows (0:00 to 6:00), because otherwise they end up at the beginning of the day instead of at the end.
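
For readers who want to see the sorting idea outside of the ini syntax, here is a minimal Python sketch. It is not WebGrab+Plus code and not necessarily how the sort command works internally. The function names `sort_key` and `sort_per_day`, the sample start times, and the HHMM-style integer key are all assumptions made for illustration; only the 600 threshold and the 2400 offset come from the ini in this post.

```python
def sort_key(start):
    """Return an integer sort key for a start time such as '5h30' or '5:30'.

    Assumption: the key is an HHMM-style integer (5h30 -> 530). The ini's
    own calculate step may form the number differently.
    """
    hours, minutes = start.replace("h", ":").split(":")
    key = int(hours) * 100 + int(minutes)
    # Shows before 06:00 belong to the end of the broadcast day,
    # so push them past every other show of that day.
    if key < 600:
        key += 2400
    return key


def sort_per_day(days):
    """Sort each day's shows separately, then join the days in order."""
    result = []
    for shows in days:
        result.extend(sorted(shows, key=sort_key))
    return result


# Hypothetical start times for two days, in the jumbled order of the site.
day1 = ["20h45", "0h30", "6h00", "13h15", "5h10"]
day2 = ["1h00", "6h00", "21h00"]
print(sort_per_day([day1, day2]))
```

Sorting both days in one pass would interleave their shows, which is why each day is sorted on its own before the results are joined.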
 
Below you will find that part of the ini. I hope you can finish the rest of the elements yourself.
 
(To use it you must upgrade to beta build 51.4, because there was a small bug in the conditional arguments for sort_by)
http://www.webgrabplus.com/sites/default/files/patchexe_prebuild.zip
 
 
Jan
 
**------------------------------------------------------------------------------------------------
* @header_start
* WebGrab+Plus ini for grabbing EPG data from TvGuide websites
* @Site: 2m.ma
* @MinSWversion:
* @Revision 0 - [your_date] Willy de Wilde/ Jan van Straaten
* - your_comments
* @Remarks: your_remarks
* @header_end
**------------------------------------------------------------------------------------------------
*
site {url=2m.ma|timezone=UTC+00:00|maxdays=7|cultureinfo=fr|charset=UTF-8|titlematchfactor=90}
url_index {url (debug)|http://www.2m.ma/layout/set/ajax/guidetv/guide2mTranche/|subpage|/|urldate|/NATIONAL}
urldate.format {datenumber|unix|5:00}
subpage.format{list|matinee|aprem|soiree}
*
index_showsplit.scrub {multi(debug)|<div class="items">|||<script type="text/javascript">} * the day segments
scope.range {(splitindex)|end}
index_showsplit.modify {|}
* add per day together
index_temp_1.modify {calculate(debug type=element format=F0)|'index_showsplit' #}
loop {('index_temp_1' > "0" max=20)|end}
index_temp_1.modify {calculate(format=F0)|3 -}
index_temp_6.modify {substring(debug type=element)|'index_showsplit' 'index_temp_1' 3} * one day
index_temp_6.modify {replace()|<img src=|\n\|<img src=} * split in shows
index_temp_6.modify {select()|"<img src=" ~} * only real shows
index_temp_6.modify {sort(ascending,integer)}
* compose the sort_by element to use for sorting index_temp_6
sort_by.scrub {single(target="index_temp_6")|<span>|| - |</span>}
sort_by.modify {replace(target="index_temp_6")|h|:}
sort_by.modify {calculate(target="index_temp_6" format=F0)|100 *}
sort_by.modify {calculate(< "600" target="index_temp_6" format=F0)|2400 +}
index_temp_6.modify {replace|\||####} * make single
index_temp_5.modify {addstart|####'index_temp_6'} * collect all sorted shows
end_loop
index_showsplit.modify {clear}
index_showsplit.modify {addstart|'index_temp_5'}
index_showsplit.modify {replace|####|\|} * back to multi
end_scope
*
index_start.scrub {single (separator="-" include=first)|<span>||</span>|</span>}
index_stop.scrub {single (separator="-" include=last)|<span>||</span>|</span>}
index_title.scrub {single|<h5>||</h5>}
index_start.modify {replace|h|:}
index_stop.modify {replace|h|:}
** _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
** ##### CHANNEL FILE CREATION (only to create the xxx-channel.xml file)
**
** @auto_xml_channel_start
** single channel, not in url
*index_site_id.modify {|}
*index_site_channel.modify {addstart|M2}
*index_site_id.modify {addstart|xx}
** @auto_xml_channel_end

