1. Can AI or machine translation alone (e.g. DeepL or Phrase) provide the level of accuracy needed for technical subtitles, or do I really need to hire a specialized subtitle translation company?
Why it’s important
Using AI to generate multilingual captions can work well as a generic solution, but AI alone performs poorly on highly technical, specialized or mission-critical content, the type of content typically found in webinars, tutorials and eLearning applications.
Using AI or MT alone for this kind of content can be risky for two main reasons:
1. Translation errors can damage your brand. This can result in loss of market share and a lower adoption rate for your product or platform overseas.
2. If your target audience includes developers who are integrating their code with your platform, inaccurate translations can lead to coding errors, ultimately translating into higher support costs for you.
The best vendors treat AI/MT as a productivity boost, not a standalone solution. They incorporate AI or MT in the workflow as a first translation pass, producing a workable draft before post-editing and clean-up by subject-matter experts in subsequent steps.
Choosing a vendor who specializes in translating technical subtitles can add value in two ways:
1. They can improve the quality of the initial AI translation by providing glossaries and translation memories, and by fine-tuning the system for the particular technical domain.
2. They can include human subject-matter experts in the workflow to post-edit, adjust subtitle formatting and timing, and check synchronization with the video. (Spoiler alert: LAI has developed a system that can do many of these tasks automatically.)
2. How can I be sure the translations you provide will be accurate for my technical domain?
Why it’s important
Every translation provider has their specific areas of expertise, with some being stronger in particular technical domains than others.
Be wary of providers that claim to translate all languages for all industries across all technical domains. Every company has its strengths and weaknesses. The company should be transparent with you on what those are.
The translation provider you choose needs to satisfy two minimum requirements:
1. They need to have experience translating technical content, and
2. They need to have experience in your particular domain.
This ensures that both the AI model and the linguistic team are optimized to deliver the highest quality for your content. A translation provider can say that QA is part of their workflow, but what exactly do they mean by QA and will they even recognize translation errors when they occur? Probably not if the provider has no relevant experience in your domain.
Of course, you almost never find an exact fit but it should be close enough that you have confidence in the quality of the finished translations.
Certifications are fine but they’re more about process and don’t reflect expertise in a particular domain. In many cases the bar is fairly low and these are not difficult to obtain (that’s why so many companies have them).
Of course, the best way to know how well a translation provider will perform on your content is to ask them to do a short test (they shouldn’t charge you if you keep the size under 200 words).
Don’t just reverse translate through Google to check it. You should have the test reviewed by a native speaker of the target language who is knowledgeable in your domain, preferably a member of your own team. The person doesn’t need to be a translator; in fact it’s probably better that they aren’t, considering that your target audience is not translators.
3. What is your translation workflow?
Related questions
• What is the role of AI, linguists and QA in the workflow?
• Do you use third-party or proprietary tools in the workflow?
• Is your workflow sufficiently agile to respond quickly to changes and updates?
• What is the typical response time?
Why it’s important
It’s important to ask these questions so you can understand how your content is handled through the provider’s workflow and have confidence in the technical accuracy and quality of their translation. You also need to know how you as the client will integrate into their workflow to improve efficiency and reduce turnaround time.
If your content resides in a CMS (Content Management System) or LMS (Learning Management System), it’s also useful to ask the vendor how they can integrate programmatically with your internal workflow.
This is less important in the near term, until you’ve established a good working relationship with the vendor and are satisfied with the quality of the translations. Depending on your needs, including the frequency and volume of updates, it may not be worthwhile to expend effort or resources on a sophisticated integration strategy, since a simple export of the source file and later reimport of the translated files may be more than adequate.
In any case, understanding the main steps of the workflow and process is important for making an informed decision when choosing a vendor.
4. How is the AI or MT tuned for my particular technical domain?
Why it’s important
Tuning an LLM or neural MT engine for your technical domain is an important step because generic engines don’t perform well on highly technical content. Put another way, without tuning, the engine’s output will require more post-editing and cleanup by human subject-matter experts, making it more difficult to scale up the process.
In addition, if the translation quality from the engine is too low, it increases the risk of errors slipping through post-editing into the final translation. In extreme cases it may even be necessary to completely retranslate the content by human subject-matter experts, which of course negates the benefit of using the AI or MT in the first place.
The actual steps in tuning can vary widely and will depend on the technical depth of the content, the type of engine being used and the availability of additional content resources. Typically, though, tuning involves providing the engine with aligned translation pairs (such as a translation memory), as well as bilingual glossaries of specialized terms.
For an LLM, providing monolingual texts can also be helpful as it enables the model to learn the syntax, terminology and style of the target domain.
Besides fine-tuning, other techniques such as RAG (Retrieval Augmented Generation) can also be used effectively. Some hybrid schemes make use of a combination of more than one of these techniques.
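To make the role of a bilingual glossary more concrete, here is a minimal Python sketch of one way it could be used during QA: whenever a source glossary term appears in an English subtitle, the check confirms that the approved target term appears in the translation. The glossary entries and the function name are hypothetical, and plain case-insensitive substring matching is an assumption; a production check would also need to handle inflected forms and word boundaries.

```python
# Hypothetical English-to-German glossary; real entries would come from the client.
GLOSSARY = {
    "access token": "Zugriffstoken",
    "endpoint": "Endpunkt",
}


def find_glossary_violations(source, target, glossary):
    """Return the source terms whose approved translation is missing from target."""
    source_lower = source.lower()
    target_lower = target.lower()
    missing = []
    for src_term, tgt_term in glossary.items():
        # Only check terms that actually occur in the source subtitle.
        if src_term.lower() in source_lower and tgt_term.lower() not in target_lower:
            missing.append(src_term)
    return missing


violations = find_glossary_violations(
    "Send the access token to the endpoint.",
    "Senden Sie das Zugriffstoken an den Server.",
    GLOSSARY,
)
print(violations)
```

A check like this does not replace review by a subject-matter expert, but it can flag inconsistent terminology before the post-editing step.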
5. What is your process for selecting and vetting linguists for my project?
Why it’s important
This is an important question to ask because of the way a translation vendor typically builds a linguistic team to work on a specific client project. Choosing the right linguists for the team who are well-matched for the technical domain is critical to ensuring the quality of the final translation.
This is no different from a software development manager building a team of individual developers for developing a specific product. The success of the product is highly dependent on the manager’s ability to build a team with the necessary background and experience who can successfully execute on the project.
Unfortunately, this is a real problem for most translation vendors because the project manager or vendor manager who is responsible for building the linguistic team typically has no relevant technical background at all, nor any understanding of the client’s technical domain. As a result, they are unable to choose linguists/post-editors who are most suitable for handling your technical content.
Even if they have linguists in their pool with technical backgrounds or real industry experience in the domain, the manager is unlikely to recognize the relevance of that experience, even if the linguist has documented it on their resume!
There are three points to consider to help mitigate this problem:
1. Ensure that the translation vendor has one or more senior managers who do have the necessary technical background (and preferably industry experience) to be able to assist less technical project managers when selecting appropriate linguists for your project.
2. Ask about internal communication flow within the vendor’s organization. Look for a vendor with good internal communication (oftentimes the smaller vendors are better at this).
3. Ask the translation vendor to take a short test (they shouldn’t charge you if you keep the size under 200 words). Then you can have a member of your technical team (or someone you trust) review the translation to check its quality.
If the vendor does well on the test, before proceeding be sure to ask them if the linguists who worked on the test will be the same ones who work on your actual project(s). Some less scrupulous vendors perform a “bait and switch”: they have their best linguists complete the translation test, only to move them to another project (possibly for a larger client), so the actual translation quality during production will be lower.
Don’t be fooled by translation vendors who tell you one of their benefits is that they have full-time on-staff linguists. This can actually be a disadvantage because the vendor will want to keep those linguists busy even if they’re not well-matched to your technical domain.
6. What subtitle formats do you support?
Related questions
• SRT vs. VTT: Which caption format is best?
• Can you deliver the translation to us in the same format as the source?
Why it’s important
Good vendors should be sufficiently flexible to handle any format that you provide, such as SRT (SubRip) and VTT (WebVTT, Web Video Text Tracks). Most LMS (Learning Management Systems) and video authoring platforms can easily export into either format.
If you’re starting small and you can’t generate a subtitle file easily, ask the translation provider if they can create a transcription of your video and provide it as an SRT or VTT file. Most providers can do this easily and will sometimes give you a discount as part of a package combining transcription and translation.
Make sure that you are in the loop to sign off on the transcription before you commit to translation, especially if you have multiple language targets. Just like in software development, bugs are much easier and cheaper to fix if they’re caught early.
If your provider has experience and is knowledgeable in your technical domain (see previous question), they should have an internal QA pass to check the quality of the transcription before they deliver it to you, so only minimal editing should be needed.
If a transcription is needed, it can also be helpful to create a glossary of proprietary, brand-specific or technical terms in advance. This will help guide the AI as it’s generating the transcription.
In terms of which format is best, it really depends on the platform and application. Realistically speaking, all subtitle formats have the same basic information: a start timecode, an end timecode and the text of the audio spoken during that time. The main difference between formats is how this information is formatted within the file.
However, there are subtle differences between formats. VTT is a more recent format and can store additional metadata for each subtitle, such as the styling of the subtitle text.
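Because the two formats carry the same basic information, converting between them is usually straightforward. The Python sketch below illustrates this under stated assumptions: the input is well-formed SRT text, and the only changes made are the ones the WebVTT format requires (a WEBVTT header line and a period instead of a comma before the milliseconds). The function name and the sample subtitles are hypothetical.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,500".
TIMING = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text):
    """Convert SRT text to WebVTT: add the header and use '.' in timecodes."""
    body = TIMING.sub(r"\1.\2 --> \3.\4", srt_text.strip())
    return "WEBVTT\n\n" + body + "\n"


sample_srt = """1
00:00:01,000 --> 00:00:04,500
Call the API with your access token.

2
00:00:05,000 --> 00:00:08,000
The response is returned as JSON.
"""

print(srt_to_vtt(sample_srt))
```

The numeric counters from the SRT file are kept as cue identifiers, which WebVTT permits. Styling or positioning metadata would have to be added separately, since SRT has no standard way to express it.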
7. What are your naming conventions for subtitle files on a multilingual project?
Why it’s important
Ideally, the translation provider should use a standardized format for its filenames that makes it easy to understand the content of the files and the language. That will save you a lot of grief and make it easier to import the file back into your tool or video platform, especially as the number of files multiplies as you add more languages and additional content sources.
Of course, this becomes less of an issue if your translation provider can integrate directly with your LMS/platform (see next question).
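As an illustration only, one possible convention is to keep the source filename and insert a language tag before the extension. The Python sketch below shows that pattern; the function name, the sample filename and the choice of BCP 47-style tags such as fr-FR are assumptions, not a convention prescribed by any particular vendor or platform.

```python
from pathlib import Path


def localized_name(source_file, lang_tag):
    """Insert a language tag before the extension: intro.srt -> intro.fr-FR.srt."""
    path = Path(source_file)
    return f"{path.stem}.{lang_tag}{path.suffix}"


for tag in ["de-DE", "fr-FR", "ja-JP"]:
    print(localized_name("api-webinar-01.srt", tag))
```

Whatever pattern is chosen, the point is that the source file and the language can both be read directly from the filename.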
8. Can you integrate directly with our webinar platform or LMS?
Why it’s important
This question is more important once you’ve established a working relationship with a subtitle translation provider and are ready to scale up in volume, or into more languages. But it can also be important to ask early on to make sure you have a growth path going forward. Dealing with individual files, as described in the previous question, can quickly become a bottleneck and you don’t want to discover this late in the game.
Most major platforms support a REST API that allows source subtitle content or files to be downloaded and translations to be uploaded. The translation vendor you choose should be able to integrate this API directly into their workflow. Since these are normally platform-specific and have not been universally standardized, some glue code may need to be developed to provide this integration, but it should not require a significant effort on the part of the vendor nor should you have to own the cost of doing this.
The ideal vendor will have an internal development team that can take care of this.
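To give a sense of what such glue code might look like, here is a rough Python sketch that prepares the two basic requests: one to download a source subtitle file and one to upload a translation. Everything platform-specific in it is hypothetical, including the base URL, the URL paths, the JSON payload fields and the bearer-token authentication. A real integration would follow the API documentation of the platform in question.

```python
import json
import urllib.request

# Hypothetical base URL and paths; every real platform defines its own API.
BASE_URL = "https://lms.example.com/api/v1"


def build_download_request(video_id, token):
    """Build a GET request for the English source subtitles of one video."""
    return urllib.request.Request(
        f"{BASE_URL}/videos/{video_id}/subtitles/en",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def build_upload_request(video_id, lang_tag, subtitle_text, token):
    """Build a PUT request that uploads a translated subtitle file."""
    payload = json.dumps({"language": lang_tag, "content": subtitle_text})
    return urllib.request.Request(
        f"{BASE_URL}/videos/{video_id}/subtitles/{lang_tag}",
        data=payload.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="PUT",
    )


request = build_upload_request("42", "fr-FR", "WEBVTT\n", "demo-token")
print(request.get_method(), request.full_url)
# Against a real endpoint, the request would be sent with
# urllib.request.urlopen(request); it is deliberately not sent here.
```

The sketch only builds the requests so that it can run without network access. In practice the vendor would add error handling, retries and whatever authentication scheme the platform requires.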
9. Can you leverage previous translations to support changes and updates?
Why it’s important
The ability to leverage previous translations from one project to another is provided by translation memory, which is a feature of all CAT (Computer-Assisted Translation) tools. The more sophisticated tools can also function as version control systems, enabling the provider to track and revert translation changes easily.
This question is important for two reasons:
1. By leveraging previous translations, you avoid having to retranslate the same content when a video or subtitle file is updated, thereby keeping cost down and improving turnaround time.
2. If you decide at some point to change translation vendors, having the translations stored in a translation memory protects your investment, making it easier for a new vendor to match the style and voice of your brand, which helps maintain consistency.
The translation memory also helps to ensure that technical terms are translated consistently. It can also be used to train an LLM or neural MT engine by providing it with aligned, bilingual texts.
In addition, by connecting the CAT tool to your LMS or video platform via an API, the translation workflow and leveraging process can be highly automated, making it ideal for responding rapidly to changes and updates whenever the video and associated subtitles are updated.
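The basic idea of leveraging can be sketched in a few lines of Python. The example below assumes exact matching only: segments of an updated subtitle file that already exist in the translation memory are reused, and only the new or changed segments are sent for translation. The function name and the sample English-to-French memory are hypothetical, and real CAT tools go further by also offering fuzzy (partial) matches.

```python
def split_by_translation_memory(segments, memory):
    """Separate segments already in the memory from those needing translation."""
    reused = {}
    to_translate = []
    for segment in segments:
        if segment in memory:
            reused[segment] = memory[segment]
        else:
            to_translate.append(segment)
    return reused, to_translate


# Hypothetical English-to-French memory built from an earlier version of the video.
memory = {
    "Open the settings page.": "Ouvrez la page des paramètres.",
    "Click Save.": "Cliquez sur Enregistrer.",
}
updated_segments = [
    "Open the settings page.",
    "Enter your API key.",
    "Click Save.",
]

reused, to_translate = split_by_translation_memory(updated_segments, memory)
print(f"{len(reused)} reused, {len(to_translate)} to translate: {to_translate}")
```

In this sample, only the one new sentence needs translation, which is where the cost and turnaround savings come from.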
10. How do you maintain synchronization between the video and the translated subtitles?
Related question
• How do you ensure that the subtitles are displayed on-screen long enough so the viewer has time to read them?
Why it’s important
The main issue here is that you need to make sure that the video and translated subtitles are synchronized so that technical information displayed on-screen matches the text displayed in the subtitle.
At the same time, the text also needs to remain on-screen long enough so the viewer has time to not only read it, but also to comprehend the information being communicated. This issue is known as subtitle pacing.
At first glance, you’d think this would be trivial. After all, when subtitles are translated, the timecode ensures that the translated subtitle is displayed at precisely the same time as the original English, and that the pacing will match the spoken word. So what’s the problem?
The answer is: it’s complicated.
There are two related issues to be concerned with here:
1. Per-language issues. Studies have shown that optimal subtitle pacing varies with the language. Some languages require a lower subtitle density (fewer characters per subtitle) to achieve the same level of comprehension compared to English.
2. Translation accuracy. Some languages require more characters than English to communicate the same information, a phenomenon known as text expansion. To maintain proper pacing, it’s therefore often necessary to condense the translation.
Most translation providers are not really aware of either of these issues. What they do is simply condense the text so it fits into the space allotted for the English. However, there are two problems with this approach.
The first is that, if the condensing is done by people who are not subject-matter experts, there is a risk of losing critical information because the linguists doing the actual work simply do not have the background or experience to understand the technical content. Therefore, they have no way of knowing if the final translation is a technically accurate and faithful rendering of the original English.
The other problem is that it fails to address issue 1 above: it does not take into account the different pacing requirements of each language.
The result is that critical information can be lost and depending on the target language, the subtitles may not be displayed on-screen long enough for the viewer to fully comprehend them.
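Pacing can be quantified as reading speed, commonly expressed in characters per second (the number of characters in a subtitle divided by the time it stays on-screen). The Python sketch below flags subtitles that exceed a per-language limit. The limits shown are illustrative assumptions rather than values from any standard or study, the function names are hypothetical, and the sketch assumes SRT-style timecodes and a display time greater than zero.

```python
# Illustrative reading-speed limits in characters per second (assumed values,
# not taken from any standard); a real project would set these per language.
MAX_CPS = {"en": 17.0, "de": 15.0}


def to_seconds(timecode):
    """Convert an SRT-style timecode 'HH:MM:SS,mmm' to seconds."""
    clock, millis = timecode.split(",")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return hours * 3600 + minutes * 60 + seconds + int(millis) / 1000


def chars_per_second(start, end, text):
    """Reading speed of one subtitle: characters divided by display time."""
    return len(text.replace("\n", "")) / (to_seconds(end) - to_seconds(start))


def is_too_fast(start, end, text, lang):
    """True if the subtitle exceeds the assumed limit for its language."""
    return chars_per_second(start, end, text) > MAX_CPS[lang]


cue = ("00:00:01,000", "00:00:03,000", "Rufen Sie die API mit Ihrem Zugriffstoken auf.")
print(round(chars_per_second(*cue), 1), is_too_fast(*cue, "de"))
```

A subtitle flagged this way needs either a longer display window or a condensed translation, and the second option is where subject-matter expertise matters.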
The impact is similar to the problems that can occur when subject-matter experts are not included in the post-editing of a machine-generated translation:
• Poor-quality or inaccurate translations can damage your brand, resulting in loss of market share and a lower adoption rate for your product or platform.
• If your target audience includes developers who are integrating their code with your platform, inaccurate translations can lead to coding errors, ultimately translating into higher support costs for you.
Unlike typical subtitle translation providers, LAI has developed a comprehensive solution that addresses all of the issues described above.
The company has developed a unique tool that automatically adjusts subtitle timing to maintain synchronization with the video and provide correct pacing for the intended audience.
When it becomes necessary to condense the translation so it will fit within the variable display window allocated for the subtitle, our teams of subject-matter experts ensure that no information is lost.
The result is adequate pacing and display time of subtitles for the viewer so that the subtitles can be read and understood at a natural pace while ensuring that technical information in the original is completely and accurately communicated in the translation.
For a more in-depth discussion of the issues and pain points associated with technical subtitle translation along with a detailed description of LAI’s innovative solution, please download our white paper, “Improving the Quality of Subtitle Localization for Technical Content”.