Hey.
Well, the srt looks like this: http://www.wikiuploa...e.php?id=201114 .
It has commercials and lines... lines that must be matched to the timings of the Italian version here: http://www.addic7ed....u_a_Taxi,_Honey . I can't think of a way to do this automatically, but if you can find a solution, that would be amazing :)
Thanks.
Help needed - copy paste
Started by honeybunny, Nov 06 2009 05:55 PM
28 replies to this topic
#22
Posted 09 December 2009 - 11:16 PM
In September 2008 (intro HIMYM style :P),
I made an app which does exactly that...
(It passes the text of one sub onto the timings of another sub with a different number of lines.)
I know it has some bugs, mainly because I was just starting out in programming and was short on time
(and I haven't touched it since then xD), but it works pretty well.
It's in Spanish, though, and there's a lot of text in its many options...
btw, it has thousands of lines of code, so I'd like to see that "zero time" xD
I'll send you an email with it tomorrow, ok?
ByeZ
PS: I think chamallow already has this app...

[Kerensky] Transcript Annotations Cleaner v26-12-2010
[Kerensky] Automatic Subtitle Synchronizer v12-01-2010
#23
Posted 09 December 2009 - 11:19 PM
Thanks, honeybunny, but what I meant is that I needed to see the format of the *transcripts* themselves!
As far as I understand, you have a plain-text file with the lines spoken in the show but no timing information, and you have a working subtitle in a language other than English that has the correct timing information.
So I believe what you're asking people to do is take the transcript, copy the foreign-language subtitle, and replace the text of each sequence with the corresponding text from the transcript.
I needed to see one of the transcripts to get a feel for the problems involved in possibly automating the process (so that even though it might take a little while to prepare now, it can be reused in the future). Some of the problems I might encounter are:
- It is difficult for a computer to figure out where one spoken line ends in the transcript file.
- The lines are too long and have to be split while pasting (which is solvable anyway).
- The lines in the transcript don't map directly to those in the subtitle file. If a human has to watch the video to figure out where each line goes in the subtitle, then it's probably impractical to try to script it.
So, in short, what I really need to see now is one of the PLAIN transcripts you received, so that I can compare it to the Portuguese files with correct timing that you mentioned before!
Did I make myself clear now? =D
Quote
In September 2008 (intro HIMYM style :P),
I made an app which does exactly that...
(It passes the text of one sub onto the timings of another sub with a different number of lines.)
I know it has some bugs, mainly because I was just starting out in programming and was short on time
(and I haven't touched it since then xD), but it works pretty well.
It's in Spanish, though, and there's a lot of text in its many options...
btw, it has thousands of lines of code, so I'd like to see that "zero time" xD
I'll send you an email with it tomorrow, ok?
ByeZ
PS: I think chamallow already has this app...
You probably chose the "wrong" language then =D I built a similar app in Python that I could adapt to this particular task, and I guess it would never go over 100 lines, probably staying around 50-60!
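For illustration only, here is a minimal Python sketch of the simplest version of this task: taking the timing lines from an existing SRT and pairing them one-to-one with transcript lines. It assumes the transcript has already been split into exactly one line per cue, which is the easy case and not necessarily what either app mentioned in this thread does. The names `read_timings` and `retime` are made up for the example.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(r"\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def read_timings(srt_text):
    """Return the timing lines of an SRT document, in order."""
    stripped = (line.strip() for line in srt_text.splitlines())
    return [line for line in stripped if TIMING.fullmatch(line)]


def retime(srt_text, transcript_lines):
    """Build a new SRT that pairs each timing with one transcript line.

    Assumption: one transcript line per cue. Anything else is rejected
    rather than guessed at.
    """
    timings = read_timings(srt_text)
    if len(timings) != len(transcript_lines):
        raise ValueError(
            f"{len(timings)} cues but {len(transcript_lines)} transcript lines"
        )
    blocks = []
    for number, (timing, text) in enumerate(zip(timings, transcript_lines), start=1):
        blocks.append(f"{number}\n{timing}\n{text}\n")
    # Cues in an SRT file are separated by a blank line.
    return "\n".join(blocks)
```

As soon as the line counts differ, this sketch simply raises an error; the split/join and matching problems listed above are exactly what it leaves out.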
#25
Posted 10 December 2009 - 08:26 AM
Quote
The lines in the transcript don't map directly to those in the subtitle file. If a human has to watch the video to figure out where each line goes in the subtitle, then it's probably impractical to try to script it.
This is pretty much always a deal breaker :P
But if you do a dictionary-based search to pair the text (and the input text has some kind of order), it may not be so impossible.
Good luck with that!
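To make the dictionary-based pairing idea a bit more concrete, here is a rough Python sketch. It is only one possible reading of the suggestion: each foreign-language cue is translated word by word through a small bilingual dictionary, and the transcript line sharing the most words is picked, searching only forward so the original order is respected. The function names, the `window` parameter, and the dictionary contents are all hypothetical.

```python
import re


def tokens(text):
    """Return the set of lowercased words in a line."""
    return set(re.findall(r"\w+", text.lower()))


def pair_lines(cues, transcript, dictionary, window=3):
    """Pair each foreign cue with the index of a transcript line.

    For every cue, look at up to `window` transcript lines starting just
    after the previous match and keep the one sharing the most translated
    words. A cue with no overlapping words is paired with None.
    """
    pairs = []
    start = 0
    for cue in cues:
        translated = {dictionary[w] for w in tokens(cue) if w in dictionary}
        best_index, best_score = None, 0
        for i in range(start, min(start + window, len(transcript))):
            score = len(translated & tokens(transcript[i]))
            if score > best_score:
                best_index, best_score = i, score
        pairs.append(best_index)
        if best_index is not None:
            start = best_index + 1  # never go backwards in the transcript
    return pairs
```

A real dictionary would need to handle inflected forms and free translations, so unmatched cues would still have to be placed by hand.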
Quote
You probably chose the "wrong" language then =D I built a similar app in Python that I could adapt to this particular task, and I guess it would never go over 100 lines, probably staying around 50-60!
Well, I'll be really impressed (I mean it!) if you can do this:
"Pass the text of one sub using the times of another sub with a different number of lines" (obviously formatting the text (split/join) along the way) in only 50-60 lines.
I did it in C#... I know Python code is usually shorter,
but that much shorter? xD
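As a small illustration of the split/join step, here is a Python sketch that spreads the words of one passage over several cues in proportion to how long each cue stays on screen. This is an assumed heuristic chosen for the example, not a description of how the C# program works, and `distribute` is a made-up name.

```python
def distribute(words, durations_ms):
    """Split a list of words into one chunk per cue.

    Each chunk's size is proportional to the cue's duration, so longer
    cues receive more words. Returns a list of strings, one per cue.
    """
    total = sum(durations_ms)
    chunks = []
    used = 0      # number of words already assigned
    elapsed = 0   # running total of cue durations
    for duration in durations_ms:
        elapsed += duration
        # Word index that the text should have reached by the end of this cue.
        end = round(len(words) * elapsed / total)
        chunks.append(" ".join(words[used:end]))
        used = end
    return chunks
```

For example, `distribute("I think we should call a taxi right now".split(), [1000, 2000])` gives the first cue three words and the second cue the remaining six. It ignores sentence boundaries and speaker changes, which is part of why a complete tool grows far beyond a few lines.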

[Kerensky] Transcript Annotations Cleaner v26-12-2010
[Kerensky] Automatic Subtitle Synchronizer v12-01-2010
#26
Posted 10 December 2009 - 12:04 PM
@Kerensky, I understand the complexity of your program now! Sorry for "underestimating" you =D
Obviously it probably can't be done in 60 lines, but I'd be interested in chatting with you via IM to know more about your program... would that be possible? =D
#27
Posted 10 December 2009 - 12:27 PM

[Kerensky] Transcript Annotations Cleaner v26-12-2010
[Kerensky] Automatic Subtitle Synchronizer v12-01-2010
#28
Posted 10 December 2009 - 11:50 PM
That's the spirit, promote the channel! :D
[/offtopic]
It's time to kick *** and chew bubble gum...
#29
Posted 18 December 2010 - 04:08 PM
By any chance, when will Cupid 01x07 - My Fair Masseuse have subtitles in English?
We have Spanish, Portuguese and Italian
Thanks in advance.