OK, I'm trying to get this to work but need some guidance. What I want to achieve is to get some additional data for movies from IMDb by using MDB, and then use REX to reformat my guide.xml to better suit the requirements of Windows Media Center. Two challenges so far:
1) When I run MDB, the resulting guide.xml is identical to the one before, and I get some error messages in the logfile (see attachment).
2) I would like to first grab the TV guide, then run MDB and create a new file, and finally run REX to process the resulting file from MDB. It is not clear to me how to achieve that last step; it seems that REX picks up the originally grabbed guide.xml instead of the one processed by MDB and placed in the mdb directory.
I'm attaching all my config files and the logfile showing the error messages. I did not touch the MDB ini files, so it's strange that I get those error messages. I will move some of the processing I do in the starhub ini file to REX once I get the hang of it. The idea is to keep the xmltv file "clean" and do the formatting for WMC separately.
Hi,
I will have a look at your files and come back to you on that. But to get you started:
- have a look at http://www.webgrabplus.com/documentation/configuration-mdb/mdbconfigxml
- You don't have to configure REX to use MDB. MDB uses REX automatically by default; the settings for it are part of mdb.config (allocation and presentation).
Jan
Hi Jan
Aha, that makes it clear, thanks for spelling it out for me :)
I did understand that the Mdb processing actually used rex, but I thought it was somehow limited to the "mdb processing".
My main problem, though, is the error messages from the MDB preprocessor. I can't figure them out, and my ini files are the defaults.
Thanks!
Christer
Sorry, I still don't get it. I'm currently doing this in rex.config.xml:
Can I do the same in mdb.config.xml and if so, where do I place this in the file?
From my understanding mdb.config.xml just gives examples of manipulating data from imdb.
And another thing: I exclude a lot of channels from MDB processing by using "<channel update ....", but I still want those channels processed as above (which I do with REX today). Is that even possible?
Thanks and please bear with me!
Christer
You don't need rex.config when you run MDB. You must specify the 'allocation and presentation' settings (as in rex.config) in mdb.config. Please read the documentation http://www.webgrabplus.com/documentation/configuration-mdb/mdbconfigxml , there is a chapter about this at the end. These settings apply to all shows in the xmltv source file, whether they are affected by additional MDB data or not. You can also specify the channels you don't want processed by the MDB 'extras', but they will still get the 'allocation and presentation' modifications (in fact I am not fully sure about that, sorry. I will check).
Jan
OK thanks, I got that. But my main problem with mdb is that I get those error messages and no resulting xml file. Please have a look at my attachments in my original post. Thanks a lot!
Because REX is an integral part of MDB, you don't need to start it in WebGrab++.config. Delete the REX line and keep only:
<postprocess run="y" grab="n">mdb</postprocess>
(when your setup is finished and stable you can change grab="n" into grab="y" to do the grab and postprocess in one run)
I have composed a mix (of your MDB and REX config) for the allocation and presentation that you can add to the end of mdb.config :
<title lang="xx">'mdb-title'</title>
<title>'title'</title>
<star-rating system="imdb">'mdb-starrating'</star-rating>
<review>{Viewers comments : 'mdb-commentsummary( ... )'}{\nRatings: 'rating(, )'.}</review>
<review type="text">{IMDb review: 'mdb-review'}</review>
<sub-title>{Episode: 'episode' }'subtitle'</sub-title>
<desc>'description'{\n\t¤ Produced in: 'productiondate'. }{¤ Category: 'category(, )'. }{\n\t¤ Actors: 'actor(, )'}{\n\t¤ Director: 'director(, )'}{\n\t¤ Presenter: 'presenter(, )'}</desc>
<credits></credits>
<episode-num></episode-num>
<date></date>
<category></category>
<rating></rating>
You can fine-tune that to your liking.
There are two remaining issues:
1. The starhub ini leaves some data between () at the end of the title, e.g. (2008) or (S1). This disturbs the primary search and will lower the hit rate. I have adapted the ini: the productiondate is separated and removed from the title. The (S1) is also removed; I don't know the meaning of that. Do you?
2. There is an issue when you exclude channels from MDB processing. They will (indeed) also be excluded from REX processing. I am working to change that because that is obviously wrong. Until I have that fixed you have two options: reduce the number of channels you grab to the ones in MDB, or accept postprocessing for all the channels.
I am still looking to improve the series matching. For successful matching it is necessary that the source xmltv contains an episode title in the subtitle. I will come back on that.
Jan
Thank you very much. I have not tried this yet; in the meantime I adjusted a plain REX configuration to my liking. Basically I just added the episode number to the subtitle and added the actors to the description.
You say you adapted the ini; do you mean you updated the Starhub ini file? If so, I can't see it on your download page. In fact I don't see the updates you did for me earlier either, the ones that removed some duplicates in Francis' latest ini files for Starhub. Anyhow, I think it's a good idea to remove the production year from the title. The (S1) or (S2) that you occasionally see is, I'm pretty sure, the season number.
I'll experiment some more with MDB this weekend; it's indeed a very interesting feature. And if for nothing else, I'm pretty much a perfectionist, so if my TV guide can be perfect I would like it to be :)
Anyhow, please advise on the revised ini file that removes the () behind the title in the Starhub channels.
Oh, and one more thing. After I removed the REX processor from the main config file things started to work, but I still get all those annoying error messages described in my first post.
Thanks again, much appreciated!
I did some more investigation and figured out the S addition. So that should go into the episode-num. Some channels also give the Ep number, so I will combine the two into the <episode-num>. Question: how do you want the episode? As S2 E5, as Season 2, Episode 5, in the xmltv_ns standard that is used by many PVRs (1.4. for the same episode), or something else?
Unfortunately, the site doesn't give episode titles, so MDB will not be able to add episode data. You will have to make do with just the general series data, without anything about the episode in question. (IMDb has no lookup on episode numbers.)
I will put the updated ini online when I have it finished. Probably tomorrow.
Jan
Wow, great news, thanks :)
Given a choice I would prefer episode/season numbers in the xmltv_ns format. I would still have to convert it to use it with my current WMC, but I prefer xmltv_ns as it's more flexible and "future proof". Since MS has pretty much abandoned Media Center, I think I'll switch to something else in the near future.
Thanks again!
Christer
The new starhub ini's are online.
If you want to try mdb :
1. in webgrab++.config specify the mdb postprocess as mentioned earlier.
2. in mdb.config specify
<site>imdb.com.ask,imdb.com.imdb</site>
<filename>guide.xml</filename>
<ldbfilename update="f">mdb.xml</ldbfilename>
<selectmovie duration="55" minumum="2" musthave="title" contains="" optional="productiondate,actor,director" />
<selectserie duration="25" minumum="2" musthave="title,subtitle" contains="" optional="productiondate,actor,director" />
You can also just disable or remove the selectserie line because there won't be any selected serie (there is no subtitle in the xmltv source with starhub).
<matchmovie mustmatch="title" optional="productiondate,actor,director" minimum="2"/>
<matchserie mustmatch="title,subtitle" optional="actor,director" minimum="2"/>
Same here: you can disable the matchserie line or remove it.
After this, add the allocation and presentation settings you already figured out.
Start with just a few channels in the webgrab config. Good candidates are the HBO, Cinemax and TCM channels.
If everything runs properly, the first IMDb matching takes a few seconds per show. If you run the same grab again everything goes a lot faster, because the matched shows are already in the ldb file and will be matched from there.
As soon as I have some time I will solve the remaining issue with the excluded channels not being processed by the REX component. But that might take some time.
good luck Jan
Thanks, really appreciated. I'll experiment with this today.
I was a bit confused at first because the date of the ini file on your site didn't change, and the header was not updated with the latest revision, but I see that it's indeed updated.
Thanks again!
Christer
I have sent you some material by mail.
Jan
Sorry for the lack of feedback, I've been really busy with other things lately. But I did some testing last Sunday and indeed all the channels got processed by REX, so my config was also applied to the channels I had excluded from MDB processing. But I experienced some problems:
1) I get lots of duplicates in my guide.xml
Example 1:
<programme start="20140307023000 +0800" stop="20140307032500 +0800" channel="beTV">
<title lang="en">Camelot</title>
<sub-title>Episode: 5 Episode: 5 </sub-title>
<desc lang="en">Camelot re-tells the tale of how a very much carefree boy, Arthur, will fulfill his destiny and rise up to become King Arthur</desc>
<desc lang="en">Camelot re-tells the tale of how a very much carefree boy, Arthur, will fulfill his destiny and rise up to become King Arthur</desc>
<desc lang="en">Camelot re-tells the tale of how a very much carefree boy, Arthur, will fulfill his destiny and rise up to become King Arthur</desc>
<episode-num system="onscreen">5</episode-num>
</programme>
Example 2:
<programme start="20140302030500 +0800" stop="20140302062000 +0800" channel="HBO">
<title lang="en">The Godfather (Part II)</title>
<desc lang="en">This brilliant piece continues the saga of two generations of successive power within the Corleone family. With Al Pacino and Robert De Niro</desc>
<desc lang="en">This brilliant piece continues the saga of two generations of successive power within the Corleone family. With Al Pacino and Robert De Niro
¤ Actors: Al Pacino, Robert De Niro, Robert Duvall</desc>
<credits>
<actor>Al Pacino</actor>
<actor>Robert De Niro</actor>
<actor>Robert Duvall</actor>
</credits>
</programme>
2) The MDB processing took a very long time; unfortunately I forgot to save the logfile. Some shows took only 0.03 seconds or so to process, but some took 300+ seconds.
3) It seemed difficult to get a match on IMDb for Asian shows/movies, e.g. shows with Asian names. Not sure if you are aware, but in Asia it's common to write your name with the last name first, in the form "last name (middle names) first name". On IMDb it seems that even Asian names are written the usual way, "first name (middle names) last name". But in the Starhub TV guide (and I guess most TV guides in Asia) they are written the "Asian way" (last name, first name). I think that's why I got no matches for Asian movies/shows whose actors are listed on IMDb; my guess is that the "actor name" match failed because of this.
Unfortunately I forgot to save the error log, or I would have attached it here. I can do another grab later if you want to see it, but for the next 3 weeks I'll be away on a business trip. I'm pretty happy with my current setup of using REX just to replace the subtitle with the episode number and add the actors to the movie description. But I'll experiment some more with MDB when I'm back.
Thank you for all your support, much appreciated. I would be in major trouble at home if I didn't get WMC up and running again after the previous setup with TvXB broke down because of site changes, and WebGrab+ saved the day. Thanks! :)
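As a stopgap for problem 1, the finished guide.xml could be cleaned with a small script after the run. This is only a rough sketch and not part of WebGrab+Plus: `dedupe_desc` is a made-up helper name, and it assumes that the longest desc per language is the one worth keeping. The sample text is shortened from Example 2 above.

```python
import xml.etree.ElementTree as ET

def dedupe_desc(programme):
    """Keep only the longest <desc> per language inside one <programme>."""
    best = {}
    for desc in programme.findall("desc"):
        lang = desc.get("lang")
        text = desc.text or ""
        # Strict '>' means the first of several identical descs is kept.
        if lang not in best or len(text) > len(best[lang].text or ""):
            best[lang] = desc
    keep = {id(d) for d in best.values()}
    for desc in programme.findall("desc"):
        if id(desc) not in keep:
            programme.remove(desc)
    return programme

# Shortened sample, based on Example 2 above.
sample = """<programme start="20140302030500 +0800" channel="HBO">
<title lang="en">The Godfather (Part II)</title>
<desc lang="en">With Al Pacino and Robert De Niro</desc>
<desc lang="en">With Al Pacino and Robert De Niro
Actors: Al Pacino, Robert De Niro, Robert Duvall</desc>
</programme>"""

programme = dedupe_desc(ET.fromstring(sample))
print(ET.tostring(programme, encoding="unicode"))
```

For a whole guide the same helper would be applied to every programme element, e.g. by looping over `root.iter("programme")`. Fixing the allocation in mdb.config remains the cleaner solution.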
Hi,
When you are back, send us your mdb.config.
re 1 : I think the multiple desc elements can be eliminated by a change in the mdb.config.
re 2 : A new match normally takes a few seconds. If no match is found, a 'deep' search starts which can take a while, maybe even 300 seconds, although I have never seen it take that long. The results are stored in the ldb so that, normally, the next time the same show needs matching the results are taken from the ldb, which takes the 0.03 seconds or so that you noticed.
re 3 : I know about the Asian customs with names and dates, most significant first (I lived in Japan for a few years). The name matching algorithm in WG++ obviously weighs the last name higher, so it is indeed important to know how the name is constructed. I would like to do some experiments; give me the channel entries when you are back.
Jan
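To illustrate the name-order problem from point 3: one way to make a comparison insensitive to "last name first" versus "first name first" is to compare the words of a name without regard to their order. This is a minimal sketch of that idea only; it is not how the WG++ matching works, `name_tokens` and `same_person` are hypothetical helpers, and the names are invented.

```python
import re

def name_tokens(name):
    """Lower-case a name and return its words as a sorted tuple."""
    return tuple(sorted(re.findall(r"\w+", name.lower())))

def same_person(guide_name, imdb_name):
    """True when both names consist of the same words, in any order."""
    return name_tokens(guide_name) == name_tokens(imdb_name)

# Invented example: guide writes last name first, IMDb writes it last.
print(same_person("Tan Wei Ming", "Wei Ming Tan"))
print(same_person("Tan Wei Ming", "Tan Wei Ling"))
```

The trade-off is that this throws away the extra weight on the last name, so two different people whose names contain the same words would be treated as one.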
I'll do another grab with the "Mdb configs" and attach logfile and my config files as soon as I'm back, but it's gonna be another 3+ weeks.
Hello,
I have been trying to get the IMDb lookup to work with data retrieved with 'yourtv.com.au.ini'. As with the topics discussed in this thread, I have encountered a few problems.
My objective was to use the extra IMDb data to give me, first of all, an episode number that I could place at the end of the description. Any further data like date or actors would be a bonus.
Ideally I was hoping to generate the format s"*"e"*", with "s" for season and "e" for episode, e.g. s1e9. From other posts and the manual it looks as if I may have to settle for just the episode, yet you did hint at both above. I want this so that my PVR can search on season and episode and avoid recording repeats, e.g. don't record a program matching s4e plus the PVR wildcard.
I have configured the yourtv source and it gives a good description for both movies and series. Yet for series, or titles with episodes, the data can be limited to little more than the title, subtitle and description.
This is where I encounter my problems:
Due to the lack of info in my source I believe I am restricted to a selectserie search on just the title and subtitle to get a reliable match.
<selectserie duration="25" minumum="2" musthave="title,subtitle" contains="" optional="" />
This gave a result of 0 hits. I can put in 'optional' entries and get hits, but the matchserie still will have the same result of 'no matches!' after searching.
Starting MDB Postprocess
Selecting movie and serie candidates:
Found 126 movie - and 0 serie candidates.
This is strange, since the source has many entries with both a title and a subtitle.
<title lang="en">Family Guy</title>
<sub-title lang="en">Peter Problems</sub-title>
<desc lang="en">When Peter finds he can't perform his, um, er, uh, manly duties, he enlists Quagmire and Joe to help him get his mojo back</desc>
I have included the config and ini files I am working with, any ideas or advice for getting a search result and ultimately a season*episode* format would be greatly appreciated.
Regards, S.
Hi Shademaker,
There are a few steps to take to get what you want.
The first, of course, is to get the selection working. From what I see in your mdb.config and the 0 matches you get, I guess you are working with an older version of WG++ (there was no logfile to verify that). With the latest version it should work, but older versions give no results when contains="" ; try contains=" " (a space). Even better, upgrade to the latest beta http://www.webgrabplus.com/sites/default/files/patchexe_prebuild.zip This contains MDB postprocessor 1.3, which has additional functionality for series episode data. See the article on the home page http://www.webgrabplus.com/content/series-episode-data-added-functionality-mdb-postprocessor.
As you will notice, for this version of MDB there is a dedicated mdb ini for series. That, together with the beta, should match serie episodes with the title and the subtitle in the xmltv source. Please read the mdb configuration documentation http://www.webgrabplus.com/documentation/configuration-mdb/mdbconfigxml.
To get you started, set minumum="10" in the movie select, which more or less disables it. Don't bother with the movies to start with.
Tomorrow I will try your source guide and come back with more. And please include a log file next time.
Jan
Hi,
Try the attached mdb.config. As a start it does the series only. You can add the movies later, once you are satisfied with the series.
The series selection is limited by specifying the categories that have a chance of being in IMDb. I have chosen contains="Drama,Mystery,Crime,Children,Thriller,Family". You can change that to your liking.
You will need the imdb serie ini http://www.webgrabplus.com/sites/default/files/download/ini/info/SiteIni.Pack/MDB%20postprocessor/imdb.com.imdb_series.ini
This returns the episode in xmltv_ns syntax. If you don't want it that way it is easy to change. Just tell me.
The xmltv_ns format looks like this [Episode: 3.5.] and the default IMDb format like this (for the same episode) [Episode: Season 4, Episode 6]
The allocations in the sample mdb.config place the episode at the end of the description. I also added the starrating there. Of course you can put them anywhere else.
A remark: matching the xmltv title and subtitle with the IMDb title and episode title, as the MDB postprocessor tries to do, remains imperfect. Many small differences can cause the match to fail, despite the algorithms used that allow for a lot of those differences. So you will notice that some of the series simply fail to match.
Jan
Hello Jan,
Thanks for the replies and for looking into my problem. I have now followed your instructions and updated to the beta version, and replaced the mdb files with the ones supplied in your post.
The mdb processor is now matching series, after a change in the mdb.config on line 184: }} at the end of the line instead of ]].
It also writes the successful matches to the mdb guide output in e.g. Episode: 2.14 format, but the season and episode numbers appear to be incorrect.
An output of Episode 2.7 for a show is actually 3.8 when checked online.
An output of Episode 0.5 should be 1.6, etc.
I do prefer the ‘s1e3’ format. If you could change that I would appreciate it; I could not work it out myself.
Thanks for your help with this great program, I owe you a beer.
Regards, Sean.
Hello Sean,
The Episode 2.7. in xmltv_ns format is actually correct for season 3, episode 8. The xmltv_ns format is zero-based, so season 1 is '0'. I understand that this is confusing, and I personally find the xmltv_ns format a silly one. But it is used by most PVR equipment and software, so WG++ supports it.
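For anyone who wants to check the numbers by hand, the zero-based conversion can be sketched in a few lines of Python. This is only an illustration, not something WG++ ships; `xmltv_ns_to_onscreen` is a made-up name, and the sketch ignores the optional third 'part' field of xmltv_ns.

```python
def xmltv_ns_to_onscreen(value):
    """Convert a zero-based xmltv_ns string such as '2.7.' to 's3e8'."""
    parts = value.strip().split(".")
    # Each field may carry an optional '/total' suffix, which is dropped here.
    season_part = parts[0].split("/")[0].strip()
    episode_part = parts[1].split("/")[0].strip() if len(parts) > 1 else ""
    if not episode_part:
        raise ValueError(f"no episode number in {value!r}")
    episode = int(episode_part) + 1          # xmltv_ns counts from 0
    if not season_part:
        return f"e{episode}"                 # season unknown
    return f"s{int(season_part) + 1}e{episode}"

print(xmltv_ns_to_onscreen("2.7."))  # s3e8
print(xmltv_ns_to_onscreen("0.5"))   # s1e6
```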
I made a variant of the imdb.com.imdb_series.ini that returns the episode in the format you prefer. http://www.webgrabplus.com/sites/default/files/download/ini/info/SiteIni.Pack/MDB%20postprocessor/imdb.com.imdb_series.onscreen.ini
It is called imdb.com.imdb_series.onscreen.ini. To use this do the following:
1. Download the new ini and place it in the mdb subfolder
2. Rename the ini filename in mdb.config.xml
From:
<site movies="imdb.com.ask,imdb.com.imdb" series="imdb.com.imdb_series" />
To:
<site movies="imdb.com.ask,imdb.com.imdb" series="imdb.com.imdb_series.onscreen" />
3. The local mdb.xml database, which is already filled with matches from earlier runs, still contains the previous episode format. The mdb postprocessor will automatically take these entries if they match a selected show in the xmltv input file. Thus the matches already stored there will keep the old episode format unless you remove all these entries from mdb.xml. The easiest is just to start with a fresh mdb.xml file; you can delete the old one or keep it under another name. Alternatively you can delete by hand all shows with the wrong episode format, like the ones with something like this <episode-num>1.17.</episode-num>.
A beer will be a bit difficult but feel free to use an alternative http://webgrabplus.com/content/support-us
Jan
Jan,
I agree the xmltv_ns format for episode numbers is quite strange, thanks for making the new imdb.com.imdb_series.onscreen.ini. It works very well and can be changed easily for anyone’s personal preference.
To help the imdb.com.imdb_series.onscreen.ini to match shows with similar but different titles in the source guide I made a small addition to the yourtv.com.au.ini.
*Change titles manually for a successful match in mdb. (must run a "f" update to apply changes.)
*Format: title.modify {replace|title in guide.xml|The title at the mbd site}
*
title.modify {replace|NCIS|NCIS: Naval Criminal Investigative Service}
title.modify {replace|NCIS: Naval Criminal Investigative Service: Los Angeles|NCIS: Los Angeles}
It is not as elegant as your work, but it produced a successful match with only a few warnings in the log. I'm still trying to absorb the documentation; maybe I'll come up with something nicer to look at next time.
I also found after reading your short example in the website documentation adding all your favourite show titles to the contains field works well too.
<selectserie contains="" duration="20" minumum="2" musthave="title,subtitle"
Thanks again, Sean.