mail.ru ini request

JohnnyParanoia

Me again lol.
I wondered if someone would be able to create an ini for http://tv.mail.ru/
It's a Russian TV guide which would be an excellent addition to WG++ and seems to have all the channels.
Thanks in advance :)

WGMaker
WG++ Team member, Donator

I will have a look
 
Jan

JohnnyParanoia
WGMaker wrote:

I will have a look
 
Jan

Thank you :)

WGMaker

Hi,

There were a few small complications with tv.mail.ru:

  • The index page needed to be sorted: the shows on it were not in TV-guide time order.
  • The start and stop times on it are two hours earlier than the ones on the site's EPG pages. I assumed the latter are correct and applied a correction.
  • I assumed the Moscow timezone.
  • The channel list file was difficult to create. Preferably it has no Cyrillic characters in the xmltv_id, because the console cannot display them. I didn't completely succeed with this: some channels still have some Cyrillic characters.

Before I place it in the collection, I'd like you to test it.

 

Jan
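
The first three corrections Jan lists above (sorting, the two-hour shift, the Moscow timezone) can be illustrated outside WG++ with a minimal Python sketch. This is not the site ini itself, which uses WG++'s own syntax. The function name, the sample shows, and the fixed UTC+3 offset used for Moscow are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Moscow time modelled as a fixed UTC+3 offset.
MOSCOW = timezone(timedelta(hours=3))
# The thread reports index-page times two hours earlier than the EPG pages.
CORRECTION = timedelta(hours=2)


def normalise(shows):
    """Shift each naive start time by the correction, tag it as Moscow
    time, and return the shows sorted into broadcast order.

    `shows` is a list of (title, naive datetime) pairs as they might be
    read from an unsorted index page (hypothetical input shape).
    """
    corrected = [
        (title, (start + CORRECTION).replace(tzinfo=MOSCOW))
        for title, start in shows
    ]
    return sorted(corrected, key=lambda show: show[1])


if __name__ == "__main__":
    sample = [
        ("Evening film", datetime(2016, 5, 15, 20, 0)),
        ("News", datetime(2016, 5, 15, 18, 30)),
    ]
    for title, start in normalise(sample):
        print(start.isoformat(), title)
```

Sorting after the correction rather than before keeps the order right even if a future correction were to differ per show.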

JohnnyParanoia

Had a quick check and looks good :)
I tried 4 channels - Sony Sci-Fi, Paramount Channel, TV1000 and TV1000 Action and they all scraped and displayed OK and the timing offset seems to be right (I'm in the UK)
Excellent job :)
Thank you.

needz

Jan,
After I grabbed one channel from mail.ru, it now constantly throws this error:
error downloading page: The remote server returned an error: (429)
It seems they are blocking the requests. Can you check?

WGMaker

Hi,

 

@ JohnnyParanoia : Is your computer set to the 'Russian' Windows version? If so, when you grab a channel with Cyrillic characters in the xmltv_id, like:

<channel update="i" site="tv.mail.ru" site_id="1750" xmltv_id="Первый лучшее">Первый лучшее</channel>

is the channel properly displayed in the console output?

 

@ needz : I don't have that issue! I tried 10 channels for 9 days and all were grabbed nicely, even rather fast. Which <user-agent> do you use in the config?
Send me your log and config files; maybe I will see something.

 

Jan

needz

Jan,
User agent is: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; yie9)
It starts grabbing one channel just fine, then runs into problems. Check the log below. All other sites work fine with my configs.

update requested for - 3 - out of - 3 - channels for 3 day(s)
update mode - set per individual channel

i=index  .=same  c=change  g=gab  r=replace  n=new

egoist updating, using site TV.MAIL.RU, mode incremental
iii............

lifenews updating, using site TV.MAIL.RU, mode incremental
iiinnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
error downloading page: The remote server returned an error: (429) .
pausing 1 of 4 times for 5 seconds before re-try.
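
HTTP 429 means "Too Many Requests", so the server is rate-limiting the client. The log shows WG++ already pausing and retrying on its own. The general technique, with a pause that grows on each attempt, can be sketched in Python as below. This is an illustration only, not WG++'s implementation. The function name and the `opener` and `sleep` parameters are hypothetical and exist only so the sketch can be tried without network access. Whether tv.mail.ru lifts its block after such pauses is not established in this thread.

```python
import time
import urllib.error
import urllib.request


def fetch_with_backoff(url, retries=4, pause=5.0,
                       opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch `url`; on HTTP 429 wait and retry, doubling the pause each time.

    Any other HTTP error, or a 429 on the final attempt, is re-raised.
    """
    for attempt in range(retries + 1):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            if err.code != 429 or attempt == retries:
                raise
            # Honour a numeric Retry-After header if the server sends one,
            # otherwise fall back to exponential backoff: 5 s, 10 s, 20 s, ...
            retry_after = err.headers.get("Retry-After") if err.headers else None
            if retry_after and retry_after.isdigit():
                wait = float(retry_after)
            else:
                wait = pause * 2 ** attempt
            sleep(wait)
```

The defaults mirror the "4 times for 5 seconds" visible in the log; the doubling is an addition of this sketch.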

JohnnyParanoia
WGMaker wrote:

Hi,
 
@ JohnnyParanoia : Is your computer set to the 'Russian' Windows version? If so, when you grab a channel with Cyrillic characters in the xmltv_id, like:
<channel update="i" site="tv.mail.ru" site_id="1750" xmltv_id="Первый лучшее">Первый лучшее</channel>
is the channel properly displayed in the console output?
 

 
No, it's set to UK English and displays "?????? ??????? updating, using site Mail.RU, mode incremental". The channel name and content display perfectly in the resulting xml though.
 

WGMaker

@ Johnny  : Same here, it was just a question. Thanks
 
@needz : see attached. No clue as yet. Which WG++ version do you use? On Windows or something else?
 
Jan

needz

Jan,
My version is WebGrab+Plus/w MDB & REX Postprocess -- version 1.54.6/0.01 on Debian Linux using Mono.
I just tried to grab it again and am getting the same error again. Could you try it on Linux somehow? Thanks.

WGMaker

I cannot try it on Linux because I don't have a machine running it. Maybe Francis has time to do that.
Can you try it with the latest beta http://www.webgrabplus.com/sites/default/files/patchexe_prebuild.zip ?
 
Jan

needz

Jan, thanks. I tried this one: WebGrab+Plus/w MDB & REX Postprocess -- version 1.1.1/55.05
But it is still the same, unfortunately.

         WebGrab+Plus/w MDB & REX Postprocess -- version 1.1.1/55.05
                           Jan van Straaten
                         Francis de Paemeleere
        many thanks to Paul Weterings and all the contributing users
        ------------------------------------------------------------

processing epgwg.xml ........
update requested for - 3 - out of - 3 - channels for 3 day(s)
update mode - set per individual channel

      i=index  .=same  c=change  g=gab  r=replace  n=new

egoist updating, using site TV.MAIL.RU, mode incremental
iii............

lifenews updating, using site TV.MAIL.RU, mode incremental
iiinnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
error downloading page: The remote server returned an error: (429) .
pausing 1 of 4 times for 5 seconds before re-try.

 

wetred

Is it possible to make an ini file for https://tv.mail.ru? It has a number of channels that no other site has.

majer_alex

Good afternoon. mail.ru has added protection against scrapers. It looks like this:

The first channel is grabbed without problems, but the second and all following channels are not. WG++ gives this message:

Group (0) :
update requested for - 1 - out of - 1 - channels for 7 day(s)
( 1/1 ) TV.MAIL.RU -- chan. (xmltv_id=728) -- mode Incremental
iiiiiii
Unable to update channel 728
Generic syntax exception:
message:
no index page data received from 728
unable to update channel, try again later
Existing guide data restored!

Job finished at 20/11/2018 10:05:55 done in 2s

If I then leave out the first channel and update the second channel a few minutes later, the update succeeds.
Would it be possible to add a way to set a delay ("uptime") between connections? Something like this:

<channel update="i" site="tv.mail.ru" site_id="2310" xmltv_id="728">Дикая Охота HD</channel>
uptime 360
<channel update="i" site="tv.mail.ru" site_id="2309" xmltv_id="667">Дикая рыбалка HD</channel>
uptime 360
<channel update="i" site="tv.mail.ru" site_id="2407" xmltv_id="654">Тайна</channel>
uptime 360
<channel update="i" site="tv.mail.ru" site_id="2427" xmltv_id="780">Романтичное HD</channel>

I could create many copies of WebGrab++.config.xml with one channel in each, but I would prefer something like the example above.
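
The workaround mentioned in this post, one config file per channel, could be generated rather than copied by hand. The Python sketch below is a rough illustration, not a WG++ feature. It assumes the config has a single root element with the `<channel>` elements as its direct children. The function name is hypothetical.

```python
import copy
import xml.etree.ElementTree as ET


def split_per_channel(config_xml):
    """Return one config XML string per <channel> element.

    Each returned config keeps every non-channel setting of the original
    and exactly one channel. Assumes <channel> elements are direct
    children of the root and have unique xmltv_id values.
    """
    root = ET.fromstring(config_xml)
    configs = []
    for keep in root.findall("channel"):
        clone = copy.deepcopy(root)
        for channel in clone.findall("channel"):
            if channel.get("xmltv_id") != keep.get("xmltv_id"):
                clone.remove(channel)
        configs.append(ET.tostring(clone, encoding="unicode"))
    return configs


if __name__ == "__main__":
    sample = (
        "<settings>"
        "<filename>guide.xml</filename>"
        '<channel update="i" site="tv.mail.ru" site_id="2310" xmltv_id="728">A</channel>'
        '<channel update="i" site="tv.mail.ru" site_id="2309" xmltv_id="667">B</channel>'
        "</settings>"
    )
    for part in split_per_channel(sample):
        print(part)
```

Each generated config would then have to be run separately with a pause in between. Note that all copies share the same output filename setting, so that would need adjusting per copy. How WG++ is launched for each file is left out because it depends on the installation.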

majer_alex

My copy of the file is the revision from 15.05.2016.

majer_alex

Thanks.

majer_alex

Unfortunately it still does not work.

Is there a restriction for IPs from Russia?
Later I will try from a server in Germany and report the results.

majer_alex

Unfortunately the problem remains: it works only for the first channel.

toleeck

Hello. Can anyone share a current, working config for tv.mail.ru?

majer_alex

Good afternoon, please tell me how to combine the output of title and sub-title in title.
Now it works like this:
title Молодой Папа
subtitle 9-я серия

but I need this:
title Молодой Папа 9-я серия

That is, the sub-title value is appended to the title.

mat8861
WG++ Team member, Donator

You can do it in the REX postprocessor: add the subtitle to the title.

majer_alex
mat8861 wrote:

you can do it in rex postprocess add subtitle to title

If it's no trouble, please give an example. I'm trying to figure it out myself.

majer_alex

Unfortunately it doesn't work:
<settings>
<filename>guide_test.xml</filename>
<title>'title'\'sub-title'</title>
</settings>

In the output I get:

<title lang="ru">Достать коротышку'sub-title'</title>
<sub-title lang="ru">1-я серия</sub-title>

majer_alex

It worked like this:

<title>'title' 'subtitle'</title>
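
For readers who want the same result outside REX, the transformation itself is small: append each programme's sub-title text to its title. Below is a minimal Python sketch that works on an XMLTV string. It is an illustration under assumptions, not how REX works internally. The function name is hypothetical and the sample data is taken from the example earlier in this thread.

```python
import xml.etree.ElementTree as ET


def merge_subtitle_into_title(xmltv):
    """Append each programme's <sub-title> text to its <title> text.

    The <sub-title> element itself is left in place; programmes without
    a sub-title are left unchanged.
    """
    root = ET.fromstring(xmltv)
    for programme in root.iter("programme"):
        title = programme.find("title")
        sub = programme.find("sub-title")
        if title is not None and sub is not None and sub.text:
            title.text = f"{title.text or ''} {sub.text}".strip()
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    guide = (
        "<tv><programme>"
        '<title lang="ru">Молодой Папа</title>'
        '<sub-title lang="ru">9-я серия</sub-title>'
        "</programme></tv>"
    )
    print(merge_subtitle_into_title(guide))
```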

mat8861

good !!

