A Few Questions About Subtitle Format That I Want To Develop


8day

For quite a while I have wanted to create a new subtitle format (hence, I must develop a renderer and an editor as well). I used to do typesetting for anime fansubs, but after a few years I noticed that subtitle formats, as well as the general view of what subtitles are, are a bit skewed. E.g., why is there no common format to store translated data from audio & video in separate, optional categories (dialogs, actions, sounds etc.; the reader could then simply enable the desired categories of information, and there could maybe even be some facility to show/distinguish between different sources of info/actors)? Why don't we have something similar to text wrapping between pages in Word documents, but between subtitles (this will be necessary once actors' names are inserted)? ATM the only advanced parts of computer-aided subtitling are typesetting and, to some extent, speech recognition, but there's so much more... maybe even text compression according to a desired ratio of chr/sec.

Basically, I see the format as a point at which all translated data from audio & video is gathered: you translate audio & video into text, just dump the information they contain (speech, songs, titles), describe the sources of that info to assign meaning to the data (in TTML terms it would be a role), add timings etc. and maybe some predefined "style sheet". One of the central concepts is that style sheets can be changed by the user (thanks to standardized meta-data, i.e. TTML "roles", which will be the interface between data and styles). Style sheets will not be limited to coloring etc.; they could describe layout and different behavior, like how to handle collisions (which is extremely important with inter-subtitle text wrapping) and whether to show the source of info/actor (this one is the source of most of the complexity, but I think it must be done). The only thing I'm afraid of is that because the format won't be as versatile as Advanced SSA, it may be much less popular (the only thing that keeps my hopes up is the popularity of SubRip, which is very simple in terms of typesetting).
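
A rough sketch of that idea in Python may help, with the caveat that everything in it is an assumption made for illustration: the `Cue` and `StyleSheet` classes, the `render` function and the role names ("dialog", "sound", "title") are hypothetical and are not taken from TTML or any existing format. It only shows data tagged with roles and a user-controlled style sheet deciding what is displayed and whether the source is shown.

```python
from dataclasses import dataclass, field


@dataclass
class Cue:
    start: float      # seconds
    end: float        # seconds
    role: str         # hypothetical role names: "dialog", "sound", "title"
    text: str
    source: str = ""  # actor or other source of the info, if known


@dataclass
class StyleSheet:
    # Which roles the user wants to see; only dialog by default.
    visible_roles: set = field(default_factory=lambda: {"dialog"})
    # Whether to prefix the text with its source (e.g. the actor's name).
    show_source: bool = False


def render(cues, sheet):
    """Return (start, end, text) tuples for the cues the style sheet allows."""
    lines = []
    for cue in cues:
        if cue.role not in sheet.visible_roles:
            continue
        text = cue.text
        if sheet.show_source and cue.source:
            text = f"{cue.source}: {text}"
        lines.append((cue.start, cue.end, text))
    return lines


cues = [
    Cue(1.0, 3.0, "dialog", "Where are you going?", source="Anna"),
    Cue(2.5, 4.0, "sound", "[door slams]"),
    Cue(5.0, 8.0, "title", "Tokyo Station"),
]

# Two users, same subtitle data, different style sheets.
minimal = StyleSheet()
full = StyleSheet(visible_roles={"dialog", "sound", "title"}, show_source=True)

for line in render(cues, full):
    print(line)
```

The point of the sketch is that the subtitle file itself never changes: `minimal` and `full` are two views of the same data.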

ATM the problem I'm facing is very complex, most likely over my head, and could take a lot of time, so I thought that before I start working on this, I'd better ask someone whether the ideas I have are sane and worth spending time on. Well, you bet I'm sure they are worthwhile, but I've noticed that fansubbing is slowly dying: people have more interesting things to do, like playing games, etc. So it may very well happen that by the time I make it usable, there won't be anyone left to use it.

The point is to make subtitles more versatile. Different people have different preferences: how collisions should be formatted and stylized (prefixed with an em-dash and whitespace in every line, only in the first line, colored in a different way, or even combined in one line by prefixing an en-dash at the beginning of every "subtitle"), and what should be shown (subtitle editor comments, actor names, sounds, primary and/or non-essential titles). All that leads to the desire to "fix" something by removing HI-related text, re-styling etc., which in its turn leads to multiple versions of the subs that are essentially the same, and to separation & duplication of work (if some error from the original subtitle was carried over to fork #2 and fork #3, then those people will probably have to fix it by themselves, which may never happen and will lower the average quality), etc. It's much wiser to keep only one version per author of the translation/transcription, rather than create a separate copy for every occasion. Basically, authors of subtitles should fill documents with data and timing, while the choice of what to show and how to show it will be left to users.
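
As a minimal sketch of the collision preferences described above (in Python, under assumptions of my own): cues are plain `(start, end, text)` tuples, and the function names and the mode names `"every_line"`, `"first_line"` and `"one_line"` are hypothetical labels for the three conventions mentioned, not part of any existing renderer.

```python
EM_DASH = "\u2014"
EN_DASH = "\u2013"


def group_collisions(cues):
    """Group (start, end, text) cues whose time ranges overlap."""
    groups = []
    group_end = None
    for cue in sorted(cues):
        start, end, _ = cue
        if groups and start < group_end:
            # Overlaps the current group: show it together with that group.
            groups[-1].append(cue)
            group_end = max(group_end, end)
        else:
            groups.append([cue])
            group_end = end
    return groups


def format_group(group, mode="every_line"):
    """Format one group of colliding cues according to the user's preference."""
    texts = [text for _, _, text in group]
    if len(texts) == 1:
        return texts[0]  # no collision, nothing to mark
    if mode == "every_line":
        return "\n".join(f"{EM_DASH} {t}" for t in texts)
    if mode == "first_line":
        return "\n".join([f"{EM_DASH} {texts[0]}"] + texts[1:])
    if mode == "one_line":
        return " ".join(f"{EN_DASH}{t}" for t in texts)
    raise ValueError(f"unknown mode: {mode}")


cues = [(1.0, 3.0, "Hello!"), (2.0, 4.0, "Hi."), (6.0, 7.0, "Bye.")]
groups = group_collisions(cues)

for group in groups:
    print(format_group(group, mode="every_line"))
```

With the mode stored in the user's style sheet rather than in the subtitle file, the same document can serve all three preferences without forks.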

But I guess it may be that there's no single good enough reason. A bit of this, a bit of that. E.g., the current champion, SubRip, is too simple, and Advanced SSA is... well, "free" and somewhat ugly: it does not have a proper explicit standard (I guess an implicit one would do, but most likely that means implementing every feature ever added, maybe even including the AS5 functionality from VSFilterMod).

BTW, I don't like XML. If I happen to finalize the format, it'll probably be in JSON/JSON-LD/YAML.
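
Purely as an illustration of what a JSON-based document could look like, here is a small Python sketch using the standard `json` module. The field names (`language`, `cues`, `start`, `end`, `role`, `source`, `text`) are assumptions invented for this example, not a finished or proposed schema.

```python
import json

# A hypothetical subtitle document: data, timing and roles, but no styling.
document = {
    "language": "en",
    "cues": [
        {"start": 1.0, "end": 3.0, "role": "dialog",
         "source": "Anna", "text": "Where are you going?"},
        {"start": 2.5, "end": 4.0, "role": "sound",
         "text": "[door slams]"},
    ],
}

# Serialize to text (what would be stored on disk) and read it back.
encoded = json.dumps(document, ensure_ascii=False, indent=2)
decoded = json.loads(encoded)

# A renderer or editor could list the roles present and offer them as toggles.
roles = sorted({cue["role"] for cue in decoded["cues"]})
print(encoded)
print(roles)
```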

Oh, and YAML reminded me that most fansubbers hate XML, and hence USF, SSF, TTML, ... WebVTT is in some respects better than SubRip, but... it is simple and complex at the same time: it stands on top of a huge stack of web standards which can be properly implemented only by a huge group of smart people ==> no proper support in the "free" software which happens to be the best ATM.

 

Updated @ 9.6.2014

It seems that I've misunderstood you... Yes, basically, at its lowest level, it's CC. The problem is that in order to implement it the way I see it, i.e. to be versatile/extensible/agile/whatever, I need to write a complete subtitling station, like the one k0tus/kotus wrote (SubStation Alpha). Partially that's because IMO there's no good renderer (good enough for high quality and optional/extensible visuals) and no good editor (most are very low level, with somewhat dated concepts/visions, or maybe they just don't feel fluid enough...), although there are a few interesting programs (Aegisub, Gaupol, Subtitle Edit, Subtitle Workshop, Gnome Subtitles (?)). In any case, yesterday I made huge progress on the design (yay!), so ATM the idea seems more realistic than ever: complex, yes, but possible.
