**------------------------------------------------------------------------------------------------
* @header_start
* WebGrab+Plus ini for grabbing IMDB data from TvGuide websites
* @MinSWversion: V1.1.1/56.25
*   - (postprocess V2.0)
* @Site: imdb.com
* @Revision 7 - [25/05/2016] Jan van Straaten
*   - support for match on episode-num or sub-title, use of new scopes
*   - added mdbinitype
* @Revision 6 - [12/02/2016] Jan van Straaten
*   - fix of the episode and productiondate
* @Revision 5 - [07/12/2015] Jan van Straaten
*   - change element names, mdb_show_id and mdb_episode_id
* @Revision 4 - [30/09/2015] Jan van Straaten
*   - added mdb-category
* @Revision 3 - [11/08/2014] Jan van Straaten
*   - improved mdb_episode_id selection
* @Revision 2 - [09/06/2014] Jan van Straaten
*   - added url header
* @Revision 1 - [24/11/2013] Jan van Straaten
*   - version check enabled
* @Revision 0 - [09/11/2013] Jan van Straaten
*   - creation
* @Remarks: Series data extraction. English version
* @header_end
**------------------------------------------------------------------------------------------------
*
*
site {url=imdb.com|mdbinitype=serie|cultureinfo=en-GB|charset=UTF-8|matchfactor=70|searchsite=imdb|episodesystem=xmltv_ns}
scope.range {(primarysearch)|end}
* primary search (using imdb's advanced search):
url_primarysearch {url()|http://www.imdb.com/search/title?&title=|'title'|&title_type=tv_series}
url_primarysearch.modify {replace| |%20}
*http://www.imdb.com/search/title?title=Touched%20by%20an%20Angel&title_type=tv_series
url_primarysearch.headers {customheader=Accept-Encoding=gzip,deflate}
mdb_show_id.scrub {multi|primary||}
*
* imdb url's:
url_mdb_p1.modify {addstart()|http://www.imdb.com/title/tt'mdb_show_id'/epdate} * all the episodes date sorted with episode title and mdb_episode_id
url_mdb_p2.modify {addstart()|http://www.imdb.com/title/tt'mdb_episode_id'} * the episode detail page
* or http://www.imdb.com/title/tt0553267/?ref_=tt_ep_pr = same
url_mdb_p3.modify {addstart|http://www.imdb.com/title/tt'mdb_episode_id'/synopsis?ref_=tt_stry_pl} * the full synopsis
* is same as
* http://www.imdb.com/title/tt2288518/synopsis full synopsis
url_mdb_p4.modify {addstart|http://www.imdb.com/title/tt'mdb_episode_id'/fullcredits?ref_=tt_ql_1} *full cast and crew (director, writer, actor)
* http://www.imdb.com/title/tt2288518/fullcredits?ref_=tt_ql_1
url_mdb_p5.modify {addstart|http://www.imdb.com/title/tt'mdb_episode_id'/plotsummary?ref_=tt_ql_5} *plot summary (not used)
url_mdb_p6.modify {addstart|http://www.imdb.com/title/tt'mdb_episode_id'/reviews?ref_=tt_ql_7} *user reviews
*
url_mdb.headers {customheader=Accept-Encoding=gzip,deflate}
end_scope
*
scope.range {(match)|end}
* imdb elements
* possible mustmatch elements
* episodetitle (sub-title)
mdb_episodetitlelist.scrub {multi()|p1|||}
* episode-num
mdb_episodenumlist.scrub {regex(pattern="'S1'.'E1'")|p1||(.+?)||}
*
mdb_title.scrub {single(separator="(" include=first exclude="")|p1||||} * the original title
mdb_title.scrub {single(separator="(" include=first)|p1||||}
mdb_title.modify {cleanup(tags="/=\"")} * removes starting "
mdb_title.modify {cleanup(tags="\"=/")}
*
** aka's not yet implemented if at all possible
*mdb_title.scrub {multi(separator=" - ")|p3|
Also Known As (AKA)
|\n||} *aka's
*mdb_title.scrub {multi|p3|
Also Known As (AKA)
|\n||} *aka's
*
mdb_temp_6.scrub {regex()|p1||\s+?(||} * all the episodes
*
end_scope
*
scope.range {(getelements)|end}
* in case of matched subtitle
mdb_temp_1.modify {calculate('mdb_episodetitlelist' not "" type=element format=F0)|'mdb_episodetitlelist' 'mdb_subtitle' @} * index of the episode
* in case of matched episodenum
mdb_temp_1.modify {calculate('mdb_episodenumlist' not "" type=element format=F0)|'mdb_episodenumlist' 'mdb_episode' @} * index of the episode
mdb_temp_1.modify {substring(type=element)|'mdb_temp_6' 'mdb_temp_1' 1} * the episode in xml format
*
* elements from mdb_temp_1 (the episode)
** get the mdb_episode_id
mdb_episode_id.modify {substring(type=regex)|'mdb_temp_1' "(\d\{7\})/\">"} * get the tt nbr for the episode
*
* the following elements are taken from the episode detail page mdb-p2
* there is a story line, director and actors, starrating, episodenum
* also a 'full synopsis' on a separate page mdb-p3 ??
*
** full productiondate
mdb_productiondate.scrub {single()|p2|}
mdb_productiondate.modify {calculate(format=productiondate)} * only year allowed!
mdb_category.scrub {regex()|p2||(.+?)||}
mdb_actor.scrub {multi()|p4|?ref_=ttfc_fc_cl_t|itemprop="name">||}
mdb_director.scrub {multi|p4|?ref_=ttfc_fc_dr|" >||}
mdb_starrating.scrub {single()|p2|
|itemprop="ratingValue">||
}
mdb_starratingvotes.scrub {single|p2|
|based on|user ratings|
}
mdb_showicon.scrub {single|p2|Poster"|src="|"|"image" />}
mdb_commentsummary.scrub {multi(exclude="SPOILERS ARE INCLUDED""This review may contain spoilers""Add another review" include=first)|p6||

|

|Add another review}
mdb_review.scrub {multi(exclude="SPOILERS ARE INCLUDED""This review may contain spoilers""Add another review" include=first)|p6|
|

|

\n\n|Add another review}
mdb_plot.scrub {single(separator="Storyline|

|

|}
mdb_description.scrub {regex|p2||||}
*
* subtitle when not already done with episodetitlelist
mdb_subtitle.modify {substring("" type=regex)|'mdb_temp_1' "
(.+?)"}
* episode must be last because it is used to get mdb_temp_1 (the actual episode data from mdb_temp_6)
mdb_episode.modify {clear}
loop {('mdb_episode' "" max=1)|end}
mdb_episode.modify {substring(type=regex)|'mdb_temp_1' "(.+?)"}
mdb_temp_3.modify {substring(type=regex)|'mdb_episode' "\.(\d*)"} * episode part
mdb_episode.modify {substring(type=regex)|'mdb_episode' "(\d*)\."} * the season part
* onscreen format
mdb_episode.modify {addstart(not "")|S}
mdb_episode.modify {addend('mdb_temp_3' not "")|E'mdb_temp_3'}
* convert to xmltv_ns
*mdb_temp_3.modify {calculate(not "" format=F0)|1 -}
*mdb_episode.modify {substring(type=regex)|'mdb_episode' "(\d*)\."} * the season part
*mdb_episode.modify {calculate(not "" format=F0)|1 -}
*mdb_episode.modify {addend()|.'mdb_temp_3'.}
end_loop
end_scope
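*
* worked example of the episode conversion in the loop above (illustration only;
* the value 3.12 is a hypothetical season.episode string, not taken from a real page):
*   scrubbed mdb_episode                      = 3.12
*   mdb_temp_3, regex "\.(\d*)"               = 12   (episode part)
*   mdb_episode, regex "(\d*)\."              = 3    (season part)
*   after addstart S and addend E'mdb_temp_3' = S3E12 (onscreen format)
* if the commented-out xmltv_ns lines were enabled instead, both parts would be
* reduced by 1 (xmltv_ns counts from zero) and the result would be 2.11.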