**------------------------------------------------------------------------------------------------
* @header_start
* WebGrab+Plus ini for grabbing EPG data from TvGuide websites
* @Site: tv.dir.bg
* @MinSWversion: V1.1.1/52
* @Revision 2 - [28/08/2013] Francis De Paemeleere
*   remove detail page grabbing (because it is often not the correct page)
* @Revision 1 - [25/08/2013] Francis De Paemeleere
*   small update
* @Revision 0 - [10/12/2012] Jan van Straaten
*   creation
* @Remarks:
*   this file should be saved with encoding charset windows-1252!
* @header_end
**------------------------------------------------------------------------------------------------
*
site {url=tv.dir.bg|timezone=UTC+02:00|maxdays=6|cultureinfo=bg-BG|charset=windows-1251|titlematchfactor=90|episodesystem=xmltv_ns}
url_index{url|http://tv.dir.bg/tv_search.php}
url_index.headers {customheader=Accept-Encoding=gzip}
*Cinemax:
*full postdata for Cinemax:
*step=1&f_tv%5B%5D=90&f_week%5B%5D=07.12&f_sub=&f_search=&all=&%C4%C8%D0%C8.x=24&%C4%C8%D0%C8.y=8
url_index.headers {method=POST|postdata=step=1&f_tv%5B%5D='channel'&f_week%5B%5D='urldate'&f_sub=&f_search=&all}
urldate.format {datestring|dd.MM}
index_showsplit.scrub {multi()|
|||}
* removed the index_urlshow, because it is not always pointing to the correct detail_page
* index_urlshow.scrub {single||||}
* index_urlshow.modify {substring(type=regex)|\\"\\"}
index_urlchannellogo {url||}
index_start.scrub {single()||||}
index_title.scrub {single()||||}
index_category.scrub {single(separator=", " include=2)|"|||}
index_productiondate.scrub {single|"|||}
index_videoquality.scrub {single||(|)|}
scope.range {(indexshowdetails)|end}
index_start.modify {replace|.|:}
index_title.modify {cleanup(tags="<"">")}
index_actor.modify {substring(type=regex)|'index_title' "(?:Актьори\|в ролите): *(.*?)(?:, *(.*))"} * get actors
index_title.modify {remove(type=regex)|'index_title' "((?:Актьори\|в ролите): *(?:.*?)(?:, *(?:.*)))"} * remove actors from title
index_actor.modify {replace|,|\|}
index_actor.modify {cleanup}
index_director.modify {substring(type=regex)|'index_title' "(?:Режисьор\|реж\.):\s*(.*?),"} * get director
index_title.modify {remove(type=regex)|'index_title' "((?:Режисьор\|реж\.):\s*(?:.*?),)"} * remove director from title
index_director.modify {cleanup}
index_temp_2.modify {substring(type=regex)|'index_title' "\((\d+) епизод\)"}
index_temp_1.modify {calculate(not="" format=F0)|1 -} * make season xmltv_ns
index_temp_2.modify {calculate(not="" format=F0)|1 -} * make episode xmltv_ns
index_episode.modify {clear}
index_episode.modify {addend('index_temp_1' not="")|'index_temp_1'}
index_episode.modify {addend()|.}
index_episode.modify {addend('index_temp_2' not="")|'index_temp_2'}
index_episode.modify {addend()|.}
index_episode.modify {clear(="..")}
index_category.modify {substring(type=regex)|'index_title' "^.*?[–-] ([^–(-]*)"}
*** move cat to description if > 4 words
index_temp_4.modify {calculate(type=word format=F0)|'index_category' #} * words , if > 4 part of description
index_description.modify {addstart('index_temp_4' > "4")|'index_category'}
index_category.modify {clear('index_temp_4' > "4")}
index_videoquality.modify {replace(~~ "HD")|'index_video_quality'|HD}
index_videoquality.modify {clear(not ~~ "HD")}
index_actor.modify {remove()|(HD)}
index_title.modify {remove(type=regex)|"^.*?(\s*[–-] .*)"}
index_title.modify {remove|"}
index_title.modify {remove|“}
index_title.modify {remove|”}
index_title.modify {remove|„}
end_scope
*title.scrub {single()||||}
*description.scrub {multi(separator="" exclude="")|
|||}
*description.scrub {multi(separator="" exclude="http://")|За шоуто:|">||} * alternative in case of news etc
*actor.scrub {single(separator=", ")|В ролите:||br />|}
*producer.scrub {single(separator=", ")|Режисьор:|||}
*
*scope.range {(showdetails)|end}
*title.modify {substring(type=regex)|"^(.*?)(?: /)"}
*description.modify {cleanup}
*subtitle.modify {substring(type=regex)|'description' "^([\"“].*?[\"”]) *[:–-]"}
*subtitle.modify {remove|"}
*subtitle.modify {remove|“}
*subtitle.modify {remove|”}
*subtitle.modify {remove|„}
*description.modify {remove(type=regex)|"^([\"“].*?[\"”] *[:–-])"}
*actor.modify {remove|<}
*actor.modify {cleanup}
*actor.modify {cleanup(removeduplicates=name, 50)}
*end_scope.range
*
** _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
** ##### CHANNEL FILE CREATION (only to create the tv-dir.bg.channel.xml file)
**
** @auto_xml_channel_start
** extraction not perfect, remove the obvious non-channels by hand
*index_site_channel.scrub {multi||}
*index_site_id.scrub {multi|}
*index_site_id.modify {remove| selected}
** @auto_xml_channel_end