[Kerensky] Transcript Annotations Cleaner v21-02-2010
Kerensky Offline
Member
***

Posts: 52
Joined: Nov 2009
Reputation: 0

Thanks: 1
9 thank was given in 5 posts

MyMood: None
Post: #16
RE: [Kerensky] Transcript Annotations Cleaner v10-12-09
New version: v10-12-09

(updated the download link of the 1st post)



New Features:
---------------------------------

- Now, by default, the app deletes the 1st text line instead of the 2nd/last one.
In Basic Options, you can switch back to deleting the 2nd/last text line.
[Requested by honeybunny]

- Fixed some (big) errors, like the ones you can see in the previous post.


- Now, the app can read srt files with badly formatted text and timecodes:

1) Blank lines between lines of text
2) Blank lines before lines of text
3) No blank line after the text
4) Lines with empty text
5) Non-consecutive line numbers
6) It is almost immune to errors in the timecodes, as seen below


INPUT:

1
00:00:00.334 --> 00:00:01.634
NARRATOR:

In November 2009,


3
00.00.03.367 --> 00,00.05.500

the first Thanksgiving
at their very own apartment.


4
0000,00.05,5 --> 00:0:8,367
And Marshall had found
the perfect turkey.


11
0:0:21,433 --> 0:0000:23,3
12
000.0:24,3 --> 00,00,26,934
So, when we showed up
for the big day,



OUTPUT: [without any option selected]

1
00:00:00,334 --> 00:00:01,634
NARRATOR:
In November 2009,


2
00:00:03,367 --> 00:00:05,500
the first Thanksgiving
at their very own apartment.


3
00:00:05,500 --> 00:00:08,367
And Marshall had found
the perfect turkey.


4
00:00:24,300 --> 00:00:26,934
So, when we showed up
for the big day,
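
The timecode repair shown in the INPUT/OUTPUT example above can be approximated with a few lines of Python. This is only a rough sketch, not the app's actual code: the names `normalize_timecode` and `normalize_timing_line` are made up for illustration, and it assumes every timecode still contains four groups of digits (hours, minutes, seconds, fraction) regardless of which separators sit between them. A short fraction such as `5` is padded on the right to `500`, which matches the example output.

```python
import re


def normalize_timecode(raw: str) -> str:
    """Rebuild HH:MM:SS,mmm from a timecode with wrong separators or padding."""
    fields = re.findall(r"\d+", raw)
    if len(fields) != 4:
        raise ValueError(f"cannot repair timecode: {raw!r}")
    hours, minutes, seconds = (int(f) for f in fields[:3])
    # The last field is a decimal fraction of a second: "5" means 500 ms.
    millis = int(fields[3].ljust(3, "0")[:3])
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def normalize_timing_line(line: str) -> str:
    """Repair both timecodes of a 'start --> end' line."""
    start, arrow, end = line.partition("-->")
    if not arrow:
        raise ValueError(f"not a timing line: {line!r}")
    return f"{normalize_timecode(start)} --> {normalize_timecode(end)}"


print(normalize_timing_line("0000,00.05,5 --> 00:0:8,367"))
```

Splitting on runs of digits rather than on specific separators is what makes the sketch tolerant of `.`, `,` and `:` being mixed up.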

(This post was last modified: 12-10-2009 05:15 PM by Kerensky.)
12-10-2009 04:50 PM
honeybunny Offline
addic7ive
*******

Posts: 270
Joined: Nov 2009
Reputation: 0

Thanks: 1
20 thank was given in 20 posts

MyMood: Wicked
Post: #17
RE: [Kerensky] Transcript Annotations Cleaner v10-12-09
great :D

12-10-2009 07:43 PM
Kerensky Offline
Post: #18
RE: [Kerensky] Transcript Annotations Cleaner v16-12-09
New version: v16-12-09

(updated the download link of the 1st post)



New Features:
---------------------------------

- New Replace++ window:


1) Normal Replace: Replace one sequence of characters with another. [requested by Alex]

[Image: replace1l.png]


2) Fix Names so they have the 1st letter in uppercase and the rest in lowercase. The names can be added:

- Manually:

[Image: replace2.png]

- From file: [requested by Verdikt]

[Image: replace3.png]

[Names should always be separated by spaces, commas, semicolons, or line breaks.]

3) Fix the most common errors in the CC [requested by honeybunny]
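
The "Fix Names" option described in item 2 could work roughly like the following Python sketch. It is an illustration under assumptions, not the app's code: `parse_names` and `fix_name_case` are hypothetical names, the separators are the ones listed above (spaces, commas, semicolons, line breaks), and matching is done on whole words only.

```python
import re


def parse_names(raw: str) -> list:
    """Split a names list on spaces, commas, semicolons or line breaks."""
    return [name for name in re.split(r"[\s,;]+", raw) if name]


def fix_name_case(text: str, names: list) -> str:
    """Rewrite every stored name as 'Xxxxx', whatever case it had in the text."""
    for name in names:
        pattern = re.compile(rf"\b{re.escape(name)}\b", re.IGNORECASE)
        text = pattern.sub(name.capitalize(), text)
    return text


names = parse_names("marshall, LILY;ted\nrobin")
print(fix_name_case("MARSHALL and lily left.", names))
```

Matching on word boundaries keeps a stored name like "ted" from being rewritten inside longer words such as "wanted".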



- Basic Options: Add *** to the empty text lines in the input, so the timing of these lines won't be lost.



- Basic Options: Now, you can select which srt tags will be deleted by the option:
"Erase all srt tags"
[requested by txu]



- Some fixes / improvements:

Now, the smart split/join text for XX chars per line works great.

And the BATCH mode is now more natural to use.




As always, I hope you guys like it! And, please, send me all the errors or ideas you can think of.

(This post was last modified: 12-16-2009 10:56 PM by Kerensky.)
12-16-2009 10:14 PM
Kerensky Offline
Post: #19
RE: [Kerensky] Transcript Annotations Cleaner v16-12-09
New version: v21-12-09

(updated the download link of the 1st post)


New Features:
---------------------------------

- Now, you can select the charset for:
the Input file, the Output file and the Stored Names file

Choose one of: ANSI, UTF-8, UTF-32, Unicode
or AUTO
(in which case the app will try to detect the charset automatically).
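
How the AUTO mode detects the charset is not described here. One common approach, sketched below in Python purely as an assumption, is to look for a byte order mark first, then try strict UTF-8, and fall back to a single-byte code page. The names `guess_charset` and `decode_auto` are hypothetical, "ANSI" is taken to mean Windows-1252, and "Unicode" is taken to mean UTF-16 (as in Windows terminology).

```python
import codecs

# UTF-32 must be tested before UTF-16: the UTF-32 LE mark starts with the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_charset(data: bytes) -> str:
    """Return a Python codec name for the raw bytes of a subtitle file."""
    for bom, codec_name in _BOMS:
        if data.startswith(bom):
            return codec_name
    try:
        data.decode("utf-8")  # strict decoding rejects most non-UTF-8 files
        return "utf-8"
    except UnicodeDecodeError:
        return "cp1252"  # assumed meaning of "ANSI"


def decode_auto(data: bytes) -> str:
    """Decode the bytes with whatever charset guess_charset picked."""
    return data.decode(guess_charset(data))


print(guess_charset("café".encode("cp1252")))
```

A guess like this can still be wrong for files in other single-byte code pages, which is a good reason to keep the manual charset choice available.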



- Context menu available anywhere in the main window, with shortcuts to:
1) Input Charset
2) Output Charset
3) Replace window
4) Load Names from file tab



- New method to translate the app (I haven't seen it in any other software):

In Options -> Language, there are now two entries: "Save current language" and "Load language"

"Save current language" will create an srt file with all the text of the app,
so you can use any software you like to translate it.

"Load language" will load a previously saved srt language file and update
all the text of the app with it. (It will appear as "Unknown language")

NOTE: DON'T TOUCH THE TIMES OF THE LINES when you translate the text.

I've added a few srt files as language packs, made with Google Translate
(not a great result, but still...)
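
The exact layout of the language srt file is not documented in this thread. The Python sketch below shows one way such a round trip could work, assuming each UI string is stored as one cue and identified by its start time (which would explain why the times must not be edited). `save_language` and `load_language` are invented names, and the one-second-per-string layout is an assumption.

```python
def save_language(strings: list) -> str:
    """Write each UI string as one srt cue; the start time encodes its slot number."""
    cues = []
    for slot, text in enumerate(strings):
        stamp = f"00:{slot // 60:02d}:{slot % 60:02d}"
        cues.append(f"{slot + 1}\n{stamp},000 --> {stamp},900\n{text}\n")
    return "\n".join(cues)


def load_language(srt_text: str) -> dict:
    """Read the cues back, mapping each slot number (from the start time) to its text."""
    translated = {}
    for block in srt_text.strip().split("\n\n"):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip cues without text
        start = lines[1].split("-->")[0].strip()
        _hours, minutes, rest = start.split(":")
        slot = int(minutes) * 60 + int(rest.split(",")[0])
        translated[slot] = "\n".join(lines[2:])
    return translated


srt = save_language(["File", "Options"])
print(load_language(srt.replace("File", "Archivo")))
```

Because the slot is recovered from the time and not from the cue number, a translation tool that renumbers the cues would not break the mapping in this sketch.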




- Added under "File" in the toolbar:
--- New (it will restart the app)
--- Recent Subtitles cleaned (it will only appear if those files still exist)



- Added: A few uncommon characters to the Replace window:
Line break, ¶, µ, ∞, λ, α, β, π, Ω, ∑, ≈
(some of them will only appear correctly if the output charset is not ANSI)



- New icons for the options (the reason for the big jump in the app size)



- Minor improvements / fixes
(More intuitive Replace window behavior, and fixes for a couple of bugs in reading the input srt and formatting the text, which only appeared when the input subtitle had a bad srt format)



btw, the smart join/split text option is far, far better than the similar option in any other software.


I'm running out of cool ideas to add to the app, so, please, all suggestions are welcome.


EDIT: New quick version v22-12-09. The loaded custom language is now stored, so you no longer have to load it each time.

(This post was last modified: 12-22-2009 03:17 PM by Kerensky.)
12-21-2009 02:49 AM
rogard Offline
Junior Member
**

Posts: 6
Joined: Nov 2009
Reputation: 0

Thanks: 0
0 thank was given in 0 posts

MyMood: None
Post: #20
RE: [Kerensky] Transcript Annotations Cleaner v22-12-09
I like the name feature (cheers verdikt) :-)

How about an "auto time compensation" feature:

Have a look at the total time of a subtitle line, then look at how many letters you have removed and adjust the resulting start/end time accordingly. (This is based on the fact that every letter has to stay a certain amount of time on screen to be read properly, so without compensation the subs will be too early if there is some descriptive stuff before the dialog, and much too long if something follows.)

Example:

INPUT:

1
00:00:00,000 --> 00:00:04,300
[WOMAN laughing in the distance]
[HARRY] So wassup?

OUTPUT:

1
00:00:03,200 --> 00:00:04,300
So wassup?

The same could happen if the HI parts are at the end of the subtitle. It will never be exact but in my opinion it would be better than nothing. A slider to adapt this feature to the user's needs would be nice too.
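
A proportional version of this idea can be sketched in a few lines of Python. This is only one possible reading of the suggestion: `compensate_timing` and its parameters are hypothetical, each character is assumed to take an equal share of the cue duration, and `strength` plays the role of the slider (0.0 leaves the times alone, 1.0 applies the full shift). The numbers it produces will not match the hand-made example above exactly.

```python
def compensate_timing(start_ms, end_ms, removed_before, kept, removed_after, strength=1.0):
    """Shrink a cue in proportion to the characters removed before/after the kept text."""
    total = removed_before + kept + removed_after
    if total == 0 or kept == 0:
        return start_ms, end_ms  # nothing left to time, leave the cue untouched
    duration = end_ms - start_ms
    # Each removed character gives up its share of the original duration.
    new_start = start_ms + round(duration * removed_before / total * strength)
    new_end = end_ms - round(duration * removed_after / total * strength)
    return new_start, new_end


# 40 characters of descriptive text removed in front of 10 characters of dialog.
print(compensate_timing(0, 4300, removed_before=40, kept=10, removed_after=0))
```

A real implementation would probably also enforce a minimum duration, so a short line of dialog does not end up flashing on screen.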

That'd be awesome...

Other ideas:
Automatic addition of leading dashes if there is only one dash for the second speaker. I like it when both speakers in a subtitle have leading dashes.

so this:

Hello.
- Hi.

becomes:

- Hello.
- Hi.

Oh, and you could add a feature to remove exclamations like Hahaha, Ouch, Oof, Ah, Erm etc.
(A list where new words can be added of course, just like the list of names.)

Keep it up. :-)
(This post was last modified: 12-23-2009 04:33 PM by rogard.)
12-22-2009 10:18 PM
Kerensky Offline
Post: #21
RE: [Kerensky] Transcript Annotations Cleaner v22-12-09
Thank you for the suggestions, rogard!

For the remove exclamations feature, I will add a couple of tabs to the Replace window.

And the automatic addition of leading dashes when there is only one dash for the second speaker will go into the Basic Options.


The "auto time compensation" feature will take a little longer, but I can definitely do something about it.
I guess that as long as the deleted text is at the beginning or at the end, it shouldn't be a big problem.

The slider you refer to is for the time duration per erased letter, right?

12-23-2009 04:13 PM
rogard Offline
Post: #22
RE: [Kerensky] Transcript Annotations Cleaner v22-12-09
That's what I meant. Some way to adjust how big the adjustment is.

I usually use Gaupol, which is awesome for spell-checking and correcting multiple(!) subtitles simultaneously, changing encoding etc., and Subtitle Workshop, which is excellent for splitting 3-line subtitles and adjusting timings/FPS. I am dreaming of a combination of both together with the ability to auto-correct times and do a few other things as well...

Who knows, maybe your tool will become exactly what I am dreaming of...?

I'd like a feature to get rid of "forbidden" characters, either by pointing them out or by replacing them. Sometimes a subtitle is broken because of a single character ("wrong" apostrophes for example) that's not correctly encoded. (See the replace list below)

You could also check for overlapping subtitles, too short duration etc.

How about a list of words or characters that are searched and replaced automatically? I need that quite often. Over time, this can be a huge help with fixing badly spelled subtitles.

In general, automatic actions are nice, but sometimes it's much better if the software asks before it removes or changes anything. Example: remove speaker before colon, which would also remove parts of the dialog if a sentence contains a colon.

Gaupol shows a list of changes, that's great. It would be even better if I could see the pending changes for each feature, so that I can concentrate on the critical ones that often go wrong. (Dream)

Other common errors that your software could tackle:

- wrong handling of blank spaces for abbreviations with dots (12:00 p.m. often becomes 12:00 p. m. or C.I. A. etc.)
- "blabla." not "blabla. " or " blabla." (Tricky, this one...)
- Sophisticated correction of the i vs. L problem (Hard to beat subrip and gaupol there....but you can always try :-)
- correction of single characters that are not in italics, surrounded by other characters in italics. (and vice versa)
- a slash "/" often appears as < i>I< /i>
< i>blabla...< /i> not < i>blabla< /i>...
- A way to choose between [ MAN ] and [MAN]
- correct music signs and blank spaces: # blabla # not #blabla#
- get rid of unnecessary blank spaces and add blank spaces when needed: begin/end of line, before/after punctuation, before/after tags.
- Capitalize words at beginning of a line/sentence, but only if the character before is a . ? or !
- add a feature to remove speakers before a colon in CAPS like MAN:
- add a feature to remove speakers before a colon like Man: (you need to be careful there, some sentences contain a colon, so you will remove half a sentence!) Maybe ask the user before you remove it?
- sometimes ain't becomes ain' t or Rock'n'Roll becomes Rock 'n 'Roll or hit 'em becomes hit'em.
- goin' on becomes goin'on etc.
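
Two of the spacing errors from this list (split abbreviations and a stray space after an apostrophe) are simple enough to sketch with regular expressions in Python. The rule table and the name `fix_common_errors` are made up for illustration, the patterns only cover the exact cases quoted above, and they can misfire on real dialog, so a tool would probably want to show the changes before applying them.

```python
import re

# (pattern, replacement) pairs; each one targets a single error from the list.
_RULES = [
    # "p. m." -> "p.m." and "C.I. A." -> "C.I.A.": drop the space inside dotted abbreviations.
    (re.compile(r"\b([A-Za-z])\. (?=[A-Za-z]\.)"), r"\1."),
    # "ain' t" -> "ain't": drop the space between an apostrophe and a trailing t.
    (re.compile(r"(\w)' t\b"), r"\1't"),
]


def fix_common_errors(text: str) -> str:
    """Apply every rule in order and return the corrected text."""
    for pattern, replacement in _RULES:
        text = pattern.sub(replacement, text)
    return text


print(fix_common_errors("Meet me at 12:00 p. m., it ain' t far."))
```

Keeping the rules in a plain list makes it easy to add new ones over time, in the spirit of the searchable replace list suggested earlier in this post.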

These are all rough ideas. Maybe you can use some of them.
Thank you for working on your subtitle software, very much appreciated.

Merry Xmas everyone!
(This post was last modified: 12-23-2009 10:00 PM by rogard.)
12-23-2009 09:59 PM
stovokor Offline
Junior Member
**

Posts: 38
Joined: Nov 2009
Reputation: 0

Thanks: 27
1 thank was given in 1 posts

MyMood: None
Post: #23
RE: [Kerensky] Transcript Annotations Cleaner v22-12-09
Thank you :) A nice gift for Christmas :)
12-24-2009 11:54 AM
Kerensky Offline
Post: #24
RE: [Kerensky] Transcript Annotations Cleaner v22-12-09
(12-23-2009 09:59 PM) rogard Wrote: That's what I meant. Some way to adjust how big the adjustment is. [...]


Big list, I like it!

But some of these things are just not doable, because this app is an automatic process.
The most I can do is present a log of changes when it's done.

Already done from your list (using the latest version):

- change encoding.

- remove speaker before colon, which would also remove parts of the dialog if a sentence contains a colon. <-- It won't do that.

- add a feature to remove speakers before a colon in CAPS like MAN:
- add a feature to remove speakers before a colon like Man: (you need to be careful there, some sentences contain a colon, so you will remove half a sentence!) Maybe ask the user before you remove it? <-- look in Main / Basic Options.


For the timing correction, I'm planning something you will like a lot ;)

I can also add a "Fix common text errors" function for a future version, to handle things like the incorrect blank spaces you describe (the timing fix comes first).

Merry Xmas guys!
( Don't let a fat Santa steal your subs ;) )

(This post was last modified: 12-24-2009 01:17 PM by Kerensky.)
12-24-2009 01:16 PM
Kerensky Offline
Post: #25
RE: [Kerensky] Transcript Annotations Cleaner v25-12-09
New version: v25-12-09

(updated the download link in the 1st post)


New Features:
---------------------------------

A couple of new features under "Basic Options":

[Image: tacopcionesbsicaseng.png]

- Automatic addition of leading dashes if there is only one dash for the second speaker. [requested by rogard]

Hello.
- Hi.

< i >Eo
- No< / i >

becomes:

- Hello.
- Hi.

< i >- Eo
- No< / i >
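
The behavior shown in this example can be sketched in Python as follows. This is an illustration, not the app's implementation: `add_leading_dash` is a hypothetical name, only two-line cues are handled, and any opening tags at the start of the first line are kept in front of the new dash, as in the italic example above (written here with normal `<i>` tags).

```python
import re

# Captures any run of tags at the start of a line, then the rest of the line.
_LEADING_TAGS = re.compile(r"^((?:<[^>]+>)*)\s*(.*)$")


def add_leading_dash(cue_text: str) -> str:
    """Add '- ' to the first line when only the second line starts with a dash."""
    lines = cue_text.split("\n")
    if len(lines) != 2:
        return cue_text
    first_tags, first_body = _LEADING_TAGS.match(lines[0]).groups()
    _second_tags, second_body = _LEADING_TAGS.match(lines[1]).groups()
    if second_body.startswith("-") and not first_body.startswith("-"):
        lines[0] = f"{first_tags}- {first_body}"
    return "\n".join(lines)


print(add_leading_dash("<i>Eo\n- No</i>"))
```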



- Sorts lines by their start times and also deletes all repeated lines
(repeated lines: two lines that have the same times and text)
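
As a rough Python sketch of this option (an assumption about the behavior, not the actual code), a cue can be modelled as a `(start_ms, end_ms, text)` tuple, so two cues are "repeated" exactly when the tuples are equal. `sort_and_dedupe` is a made-up name.

```python
def sort_and_dedupe(cues: list) -> list:
    """Sort cues by start time and drop cues with identical times and text."""
    seen = set()
    result = []
    # sorted() is stable, so cues sharing a start time keep their original order.
    for cue in sorted(cues, key=lambda c: c[0]):
        if cue not in seen:
            seen.add(cue)
            result.append(cue)
    return result


cues = [(5000, 6000, "b"), (1000, 2000, "a"), (5000, 6000, "b")]
print(sort_and_dedupe(cues))
```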


And a screencap of the context menu of the main window introduced in the last version:

[Image: tacmenucontexeng.png]



This will be the last version of the TAC for a while (except for bug corrections).

But don't worry ;), it will return stronger than ever, and with a new name:
Automatic Subtitle Editor

(This post was last modified: 12-25-2009 03:26 PM by Kerensky.)
12-25-2009 03:13 PM
rogard Offline
Post: #26
RE: [Kerensky] Transcript Annotations Cleaner v25-12-09
Looks awesome. Thank you for your efforts, kerensky. I will test it in the next few days and tell you what I think.

I have another idea: how about different profiles/presets with different settings, i.e. one for removing HI parts, another one for general corrections, yet another for very special adjustments etc....
I think that would make it more efficient to use your software for different tasks.
12-26-2009 03:00 PM
Kerensky Offline
Post: #27
RE: [Kerensky] Transcript Annotations Cleaner v25-12-09
(12-26-2009 03:00 PM) rogard Wrote: I have another idea: how about different profiles/presets with different settings, i.e. one for removing HI parts, another one for general corrections, yet another for very special adjustments etc....
I think that would make it more efficient to use your software for different tasks.

Right now the app doesn't have enough options to need profiles, but I'll keep it in mind for the next versions.

12-28-2009 07:09 PM
Kerensky Offline
Post: #28
RE: [Kerensky] Transcript Annotations Cleaner v25-12-09
New version: v25-12-09.2

(updated the download link in the 1st post)


Just a bug fix.

It fixes the error that produced the following stack trace:

System.IndexOutOfRangeException: Index was outside the bounds of the array.
at LimpiaTranscript.LinSRT.EliminaRenglonesNoValidos(LinSRT Entrada)
at LimpiaTranscript.SRT.EliminaLineasVacias(SRT Entrada)
at LimpiaTranscript.LimpiaTranscript.Procesado_Click(Object sender, EventArgs e)


Big thanks to enigma92 for sending the debug info.

(This post was last modified: 01-08-2010 02:18 PM by Kerensky.)
01-08-2010 02:18 PM
Kerensky Offline
Post: #29
RE: [Kerensky] Transcript Annotations Cleaner v25-12-09.3
New version: v25-12-09.3

(updated the download link in the 1st post)


- Another bug fix.

It fixes the error that produced the following stack trace:

System.ArgumentOutOfRangeException: Count cannot be less than zero.
Parameter name: count
at System.String.Remove(Int32 startIndex, Int32 count)
at LimpiaTranscript.LinSRT.EliminaNombresLin(LinSRT Entrada)
at LimpiaTranscript.SRT.EliminaNombres(SRT Subt)
at LimpiaTranscript.LimpiaTranscript.Procesado_Click(Object sender, EventArgs e)


- The automatic correction of CC errors (like ♪) now works properly.

Big thanks to elderman for sending the debug info.

(This post was last modified: 02-06-2010 02:32 PM by Kerensky.)
02-06-2010 02:26 PM
Kerensky Offline
Post: #30
RE: [Kerensky] Transcript Annotations Cleaner v16-02-2010
New version: v16-02-2010.2

(updated the download link in the 1st post)


Fixed things:
  • Now, if an annotation has ":" at the end, like "(Howling voice):", it will be eliminated too.
  • Smart Join/Split text:
    1. No more line breaks inside compound words like "Star-Gate"
    2. No more line breaks inside srt tags (this only happened if the tag was in the middle of the text)
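
The first of these two fixes can be illustrated with Python's standard `textwrap` module. This is a sketch of the general technique, not the app's algorithm: `rewrap_cue` is a hypothetical helper, and it relies on `break_on_hyphens=False` and `break_long_words=False` so that a hyphenated word is always moved to the next line as a whole.

```python
import textwrap


def rewrap_cue(text: str, max_chars: int) -> list:
    """Re-split subtitle text into lines of at most max_chars, keeping words whole."""
    return textwrap.wrap(
        " ".join(text.split()),   # join the existing lines first
        width=max_chars,
        break_on_hyphens=False,   # never split "Star-Gate" at the hyphen
        break_long_words=False,   # never cut a single long word in two
    )


for line in rewrap_cue("We watched Star-Gate\ntogether last night", 14):
    print(line)
```

Tags without spaces, such as `<i>`, are treated as part of the neighbouring word here, so they are not split either; tags that contain spaces would need extra handling.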

EDIT: A quick fix for the balance text option.

I just finished my exams, so I will soon start working on the next major release.

As always, feel free to post bugs/suggestions/comments/whatever.

(This post was last modified: 02-16-2010 10:19 PM by Kerensky.)
02-16-2010 08:58 PM