As far as I can tell, every entry that has a sub-title in the WG++ output from Radio Times data has that field populated with the actual sub-title duplicated. The pattern appears to be:
- if the actual sub-title ends with a question mark or an exclamation mark, the field contains two copies of the actual sub-title separated by a space.
- otherwise, the field contains two copies of the actual sub-title separated by a full stop and a space.
(An example of each case is included below.)
Is this a feature of the Radio Times data or of WG++?
Thanks,
Dan
WebGrab+Plus/w MDB & REX Postprocess -- version V5.2.0.0
WebGrab+Plus\siteini.pack\UK\radiotimes.com.ini -- Revision 31
Config file (first 20-30 lines):
<settings>
<!-- for detailed info about the settings see http://webgrabplus.com/documentation/configuration/webgrabconfigxml
and http://webgrabplus.com/sites/default/files/downloads/Misc/Documented_Con... -->
<filename>guide.xml</filename>
<mode/>
<postprocess grab="y" run="n">mdb</postprocess>
<user-agent>Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36 Edg/91.0.864.59</user-agent>
<!-- to get access to sites which require your credentials -->
<credentials user="your username for the site" password="your password for the site">site name</credentials>
<!-- for siteini's that need a decrypt_userkey -->
<decryptkey site="site-name">decrypt_userkey</decryptkey>
<!-- add the correct license id values in the next line -->
<license wg-username="xxxx" registered-email="xxxxx" password="xxxxx">To force a license update; replace this text with the letter f</license>
<logging>on</logging>
<retry time-out="10">4</retry>
<!-- timespan 0 is one day -->
<timespan>14</timespan>
<update>i</update>
<!--
Replace the next dummy channel entry with the channels you want.
You can look into the installed siteini.pack folder on your computer
For the latest version,
see http://webgrabplus.com/epg-channels for the available sites/channels
or https://github.com/SilentButeo2/webgrabplus-siteinipack/tree/master/site...
or run SiteIni.Pack.Update.exe which is in the bin folder
-->
<channel update="i" site="radiotimes.com" site_id="htpp##7101c0db-8b33-59ad-9076-13a47591751b" xmltv_id="BBC One London">BBC One London</channel>
<channel update="i" site="radiotimes.com" site_id="htfy##286ac49c-6589-526e-91c0-1535bfa37c0d" xmltv_id="BBC Two England">BBC Two England</channel>
<channel update="i" site="radiotimes.com" site_id="hvv9##0ecfae8c-b3a1-58d3-94df-0b17a69712c9" xmltv_id="BBC Three">BBC Three</channel>
<channel update="i" site="radiotimes.com" site_id="htfc##b4c9bef9-ef51-5ed7-9a35-d197aef04d1f" xmltv_id="BBC Four">BBC Four</channel>
A couple of sample extracts from the output:
<programme start="20240620101500 +0000" stop="20240620111500 +0000" channel="BBC One London">
<title lang="en">Homes Under the Hammer</title>
<sub-title lang="en">A Tale of Three Semis. A Tale of Three Semis</sub-title>
<desc lang="en">Dion Dublin visits semi-detached houses in Stoke and Wirral and Tommy Walsh inspects a third semi in Gillingham, Kent, before an update on the changes the new owners have made.(n)</desc>
...
<programme start="20240621064000 +0000" stop="20240621070500 +0000" channel="Channel 4">
<title lang="en">Everybody Loves Raymond</title>
<sub-title lang="en">Why Are We Here? Why Are We Here?</sub-title>
<desc lang="en">When Debra gets fed up with Ray's family intruding in her life, they both reminisce about how quiet life used to be in their old apartment.(n)</desc>
...
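For anyone who wants to clean up an existing guide.xml in the meantime, here is a minimal Python sketch of a workaround. It is not part of WG++ and is not how the siteini works. It assumes the duplication always follows the two patterns described at the top of this post. The function names dedupe_subtitle and dedupe_xmltv are made up for illustration.

```python
import xml.etree.ElementTree as ET


def dedupe_subtitle(text: str) -> str:
    """Return one copy of the sub-title if `text` looks duplicated.

    Assumed patterns (taken from the observations in this post):
      "X. X"  two copies joined by a full stop and a space
      "X X"   two copies joined by a space, where X ends in '?' or '!'
    Anything else is returned unchanged.
    """
    for sep in (". ", " "):
        remaining = len(text) - len(sep)
        # Two equal copies plus the separator: the remainder must be even.
        if remaining <= 0 or remaining % 2:
            continue
        half = remaining // 2
        first = text[:half]
        middle = text[half:half + len(sep)]
        second = text[half + len(sep):]
        if middle != sep or first != second:
            continue
        # Be conservative: only accept the space-only join when the
        # sub-title ends in '?' or '!', so "ha ha" is left alone.
        if sep == " " and not first.endswith(("?", "!")):
            continue
        return first
    return text


def dedupe_xmltv(root: ET.Element) -> int:
    """Fix every <sub-title> under `root` in place; return how many changed."""
    changed = 0
    for node in root.iter("sub-title"):
        original = node.text or ""
        fixed = dedupe_subtitle(original)
        if fixed != original:
            node.text = fixed
            changed += 1
    return changed
```

Usage would be along the lines of loading the file with ET.parse("guide.xml"), calling dedupe_xmltv(tree.getroot()), and writing the tree back out with tree.write(...). The check is deliberately strict, so a sub-title that does not match either pattern exactly is left untouched.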
fixed