
Duplicated text in Sub-title field in WG++ output from Radio Times data

londc3
Donator

I've noticed that, as far as I can tell, every entry that has a sub-title in the WG++ output from Radio Times data has the actual sub-title duplicated in that field:
- if the actual sub-title ends with a question mark or exclamation mark, the field contains two copies of it separated by a space;
- otherwise, the field contains two copies of it separated by a full stop and a space.
(An example of each case is below.)

Is this a feature of the Radio Times data or of WG++?

Thanks,
Dan

WebGrab+Plus/w MDB & REX Postprocess -- version V5.2.0.0

WebGrab+Plus\siteini.pack\UK\radiotimes.com.ini -- Revision 31

Config file is (first 20-30 lines):
<settings>
<!-- for detailed info about the settings see http://webgrabplus.com/documentation/configuration/webgrabconfigxml
and http://webgrabplus.com/sites/default/files/downloads/Misc/Documented_Con... -->
<filename>guide.xml</filename>
<mode/>
<postprocess grab="y" run="n">mdb</postprocess>
<user-agent>Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36 Edg/91.0.864.59</user-agent>
<!-- to get access to sites which require your credentials -->
<credentials user="your username for the site" password="your password for the site">site name</credentials>
<!-- for siteini's that need a decrypt_userkey -->
<decryptkey site="site-name">decrypt_userkey</decryptkey>
<!-- add the correct license id values in the next line -->
<license wg-username="xxxx" registered-email="xxxxx" password="xxxxx">To force a license update; replace this text with the letter f</license>
<logging>on</logging>
<retry time-out="10">4</retry>
<!-- timespan 0 is one day -->
<timespan>14</timespan>
<update>i</update>
<!--
Replace the next dummy channel entry with the channels you want.
You can look into the installed siteini.pack folder on your computer

For the latest version,
see http://webgrabplus.com/epg-channels for the available sites/channels
or https://github.com/SilentButeo2/webgrabplus-siteinipack/tree/master/site...
or run SiteIni.Pack.Update.exe which is in the bin folder
-->
<channel update="i" site="radiotimes.com" site_id="htpp##7101c0db-8b33-59ad-9076-13a47591751b" xmltv_id="BBC One London">BBC One London</channel>
<channel update="i" site="radiotimes.com" site_id="htfy##286ac49c-6589-526e-91c0-1535bfa37c0d" xmltv_id="BBC Two England">BBC Two England</channel>
<channel update="i" site="radiotimes.com" site_id="hvv9##0ecfae8c-b3a1-58d3-94df-0b17a69712c9" xmltv_id="BBC Three">BBC Three</channel>
<channel update="i" site="radiotimes.com" site_id="htfc##b4c9bef9-ef51-5ed7-9a35-d197aef04d1f" xmltv_id="BBC Four">BBC Four</channel>

A couple of sample extracts from the output:

<programme start="20240620101500 +0000" stop="20240620111500 +0000" channel="BBC One London">
<title lang="en">Homes Under the Hammer</title>
<sub-title lang="en">A Tale of Three Semis. A Tale of Three Semis</sub-title>
<desc lang="en">Dion Dublin visits semi-detached houses in Stoke and Wirral and Tommy Walsh inspects a third semi in Gillingham, Kent, before an update on the changes the new owners have made.(n)</desc>
...

<programme start="20240621064000 +0000" stop="20240621070500 +0000" channel="Channel 4">
<title lang="en">Everybody Loves Raymond</title>
<sub-title lang="en">Why Are We Here? Why Are We Here?</sub-title>
<desc lang="en">When Debra gets fed up with Ray's family intruding in her life, they both reminisce about how quiet life used to be in their old apartment.(n)</desc>
...
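
As a stopgap until a corrected siteini is in place, the duplication pattern described above can be undone by post-processing the output file. The following is only a minimal sketch, not part of WG++ or the actual siteini fix: the function names `collapse_duplicate` and `fix_subtitles` are made up for illustration, and it assumes the output is the `guide.xml` named in the config above. It cannot tell a genuine repeat (for example a real sub-title "Bye! Bye!") from a duplicated one, so such titles would also be collapsed.

```python
import re
import sys
import xml.etree.ElementTree as ET

# "X. X": two copies of X joined by a full stop and a space.
_DUP_FULL_STOP = re.compile(r"(.+)\. \1", re.DOTALL)
# "X? X?" or "X! X!": two copies of X (ending in ? or !) joined by a space.
_DUP_PUNCT = re.compile(r"(.+[?!]) \1", re.DOTALL)


def collapse_duplicate(text):
    """Return text with an exact duplication of either kind collapsed to one copy."""
    for pattern in (_DUP_FULL_STOP, _DUP_PUNCT):
        match = pattern.fullmatch(text)
        if match:
            return match.group(1)
    return text


def fix_subtitles(root):
    """Collapse duplicated <sub-title> text in place; return how many were changed."""
    changed = 0
    for sub in root.iter("sub-title"):
        if not sub.text:
            continue
        original = sub.text.strip()
        fixed = collapse_duplicate(original)
        if fixed != original:
            sub.text = fixed
            changed += 1
    return changed


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical script name): python fix_subtitles.py guide.xml guide.fixed.xml
    tree = ET.parse(sys.argv[1])
    count = fix_subtitles(tree.getroot())
    # Note: ElementTree does not keep a DOCTYPE line when writing the file back.
    tree.write(sys.argv[2], encoding="utf-8", xml_declaration=True)
    print(f"Collapsed {count} duplicated sub-title(s)")
```

Writing to a separate output file rather than overwriting `guide.xml` keeps the original available for comparison.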

Blackbear199
WG++ Team member, Donator

fixed

Attachments: 
londc3
Donator

Thanks for fixing this. I recently upgraded to 5.3 and the problem recurred, so I realised I had to re-apply the fix you sent me, which I have now done. Could I also suggest replacing the current version of that ini file on the website with the fixed version, so that future upgrades include the fix?

Cheers,
Dan

Blackbear199
WG++ Team member, Donator

It should already be done. Have you tried a siteini.pack update?
Any fix I make, I send to mat8861, who pushes the changes to git.

londc3
Donator

I copied the siteini from the website pack because I couldn't find the update utility. Next time I'll use the update utility.

Thanks,
Dan

mat8861
WG++ Team member, Donator

The siteini pack update utility (SiteIni.Pack.Update.exe) is in the bin or bin.net folder. On Windows that is C:\Program Files (x86)\WebGrab+Plus\bin
