How to find your way in the Excel maze
Excel is almost certainly the file format encountered most frequently in the translation industry.
Microsoft Excel can be described as an application that utilises spreadsheets to organise numbers and data using formulas and functions. Spreadsheets may not sound like suitable containers for translatable text.
However, Excel spreadsheets do make sense for submitting content for translation, for several reasons:
Excel spreadsheets can be opened, read and amended using almost any computer or device.
It is quick and easy to copy from and paste to Excel spreadsheets.
Content can be entered into parallel columns, making data easier to organise.
Excel’s functionality is comparatively easy to master.
Excel spreadsheets can be exported to a variety of applications.
The Excel web app enables users to view and work on data sets simultaneously.
Multilingual Excel
Excel documents for translation commonly feature a column containing the source text (usually column A) together with further columns for each target language. In the past, it was difficult to generate a functional target file from this structure. But the latest translation technology can identify the contents of each column and where to place the various translations.
It is now possible to limit both the preparation and post-processing work to one source file import and one target file export, regardless of the number of target languages.
Existing translations can be ignored, imported for revision or marked as locked in a multilingual Excel document. When locked, translations can still be referenced but are not included in the word count.
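The three ways of handling existing translations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sheet is modelled as plain lists (source in the first column, one column per target language), and the names Segment, build_segments, billable_words and the policy labels are hypothetical, not taken from any particular TMS.

```python
from dataclasses import dataclass

# Hypothetical policy labels for existing translations; real TMS settings differ.
IGNORE, REVISE, LOCK = "ignore", "revise", "lock"


@dataclass
class Segment:
    source: str
    language: str
    target: str = ""
    locked: bool = False


def build_segments(rows, languages, policy):
    """Turn rows of [source, target_1, target_2, ...] into per-language segments."""
    segments = []
    for row in rows:
        source = row[0]
        for offset, language in enumerate(languages, start=1):
            existing = (row[offset] if offset < len(row) else "") or ""
            if policy == IGNORE or not existing:
                # Existing text is dropped (or absent): translate from scratch.
                segments.append(Segment(source, language))
            elif policy == REVISE:
                # Existing text is imported so that it can be revised.
                segments.append(Segment(source, language, existing))
            else:
                # LOCK: kept for reference, excluded from the word count.
                segments.append(Segment(source, language, existing, locked=True))
    return segments


def billable_words(segments):
    """Count source words, skipping locked segments."""
    return sum(len(s.source.split()) for s in segments if not s.locked)


rows = [
    ["Save your changes", "Änderungen speichern", ""],
    ["Cancel", "", ""],
]
locked_run = build_segments(rows, ["de", "fr"], LOCK)
print(billable_words(locked_run))  # 5: the locked German segment is not counted
```

Running the same rows with the IGNORE policy counts every segment, which shows how locking existing translations lowers the billable volume.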
Metadata
Excel documents frequently contain metadata with the potential to enrich the translation process. There are three types of metadata that can commonly be extracted from Excel sheets: comments, context IDs, and character limitations. These can be transferred to the TMS for the benefit of translators.
Comments
Apposite comments are highly beneficial for any translation project. Valuable data such as processing instructions or indications of style and tone help to ensure superior outputs. If your source Excel document features one or more columns for comments, always import these into your TMS, giving linguists direct access to them while translating.
As a result, less reference material needs to be sent separately by email.
Software Strings
Software strings or website pages in Excel documents are usually accompanied by contextual information known as IDs. Almost any data can serve as an ID, including software keys, provided that it is unique to a source cell.
When you import contextual information together with the source cells to the TMS, the data will be stored with every translation in the translation memory. The contextual information enriches translation memory matches with both structural context and traditional textual context.
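To illustrate why storing the ID with each translation matters, here is a minimal sketch of a translation memory that prefers a match on both ID and source text over a match on source text alone. The class TranslationMemory, its methods and the match labels are hypothetical; commercial systems use their own matching and scoring rules.

```python
class TranslationMemory:
    """Toy translation memory keyed by context ID and source text."""

    def __init__(self):
        self._by_context = {}  # (context_id, source) -> target
        self._by_source = {}   # source -> most recently stored target

    def store(self, context_id, source, target):
        self._by_context[(context_id, source)] = target
        self._by_source[source] = target

    def lookup(self, context_id, source):
        """Return (target, match_type); match_type is 'context', 'exact' or 'none'."""
        key = (context_id, source)
        if key in self._by_context:
            return self._by_context[key], "context"
        if source in self._by_source:
            return self._by_source[source], "exact"
        return None, "none"


tm = TranslationMemory()
tm.store("menu.file.open", "Open", "Öffnen")
tm.store("status.door.open", "Open", "Offen")

print(tm.lookup("menu.file.open", "Open"))  # ('Öffnen', 'context')
print(tm.lookup("toolbar.open", "Open"))    # ('Offen', 'exact')
```

The same English string has two German translations here, and only the ID tells the memory which one belongs in the menu.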
Character Limitations
Character limitations are important aspects of website localisation. They enable translators to avoid conflicts with web and document designs but can present significant obstacles, depending on the target language.
Character restrictions often tempt linguists to experiment with cumbersome formulas or even macros after an Excel document has been translated. But there are easier solutions. Spreadsheet columns containing character limitations may be imported to and reproduced in the TMS. Translators will receive a warning if they exceed character limits. When a length issue isn’t identified during translation, it will be reported when running a QA (quality assurance) check.
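The kind of length check a TMS runs can be approximated as follows. This is a sketch under assumptions: each row is a tuple of segment ID, translated text and the limit taken from the spreadsheet column, the function name check_length is hypothetical, and length is measured with Python's len, which counts Unicode code points rather than pixels or bytes.

```python
def check_length(rows):
    """Return (segment_id, overflow) for every target that exceeds its limit.

    rows: iterable of (segment_id, target_text, max_chars);
    a max_chars of None means the segment has no limit.
    """
    issues = []
    for segment_id, target, max_chars in rows:
        if max_chars is None:
            continue
        overflow = len(target) - int(max_chars)
        if overflow > 0:
            issues.append((segment_id, overflow))
    return issues


translated = [
    ("btn.save", "Speichern", 10),
    ("btn.cancel", "Abbrechen", 6),
    ("page.title", "Willkommen", None),
]
print(check_length(translated))  # [('btn.cancel', 3)]
```

A check like this replaces the LEN formulas and macros that are otherwise added to the spreadsheet after translation.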
Embedded content
When content has simply been copied and pasted to an Excel worksheet, it may feature embedded file formats such as HTML, JSON or XML. It isn’t unusual to discover a combination of embedded formats including HTML tags in JSON objects.
Embedded formatting could cause complications for localisation engineers. Fortunately, most translation management systems are equipped to process most formatting smoothly.
Every respected translation tool supports embedded HTML while more complex formatting can often be tackled with the use of regular expressions.
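As an illustration of the regular-expression approach, the sketch below replaces HTML-style tags with numbered placeholders before translation and restores them afterwards. The pattern is deliberately simple and the function names protect_tags and restore_tags are hypothetical. It assumes that the text contains no literal placeholders such as {1} of its own, and it is not a full HTML parser.

```python
import re

# Matches simple opening and closing tags such as <b>, </b> or <a href="...">.
TAG_PATTERN = re.compile(r"</?[A-Za-z][^<>]*>")
PLACEHOLDER_PATTERN = re.compile(r"\{(\d+)\}")


def protect_tags(text):
    """Replace each tag with a numbered placeholder; return (text, tags)."""
    tags = []

    def replace(match):
        tags.append(match.group(0))
        return "{" + str(len(tags)) + "}"

    return TAG_PATTERN.sub(replace, text), tags


def restore_tags(text, tags):
    """Put the original tags back in place of the numbered placeholders."""
    return PLACEHOLDER_PATTERN.sub(lambda m: tags[int(m.group(1)) - 1], text)


protected, tags = protect_tags("Click <b>Save</b> to continue.")
print(protected)  # Click {1}Save{2} to continue.
print(restore_tags("Klicken Sie auf {1}Speichern{2}.", tags))
# Klicken Sie auf <b>Speichern</b>.
```

Because the tags are numbered, the translator can reorder them to suit the target language without touching the markup itself.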
Complications, tips and tricks
There are further complications that present obstacles to the successful translation of content in Excel documents. For instance, a client might use a background colour to indicate cells which are to be ignored. This situation could be tackled by hiding cells in the source document. However, you might find that your TMS recognises background colour as an indicator of non-translatable content.
Formulas are also potential obstacles to successful translations. If a source cell yields no translatable content in the TMS, it is likely that the data in that cell has been generated by a formula.
This situation can be overcome before importing the document into the TMS by copying the source content and then pasting it back to the cell as a value rather than as a formula. This can be achieved by using the Paste Values option in Excel:
Select all the cells containing formulas that you wish to convert.
Press Ctrl + C or Ctrl + Ins to copy the formulas and their results to the clipboard.
Press Shift + F10 and then V to paste only values back to Excel cells.
Situations may arise with Excel documents which a TMS alone cannot tackle, such as the inclusion of different non-translatable cells per target language or cells which consist of formulas with translatable text portions. It is worth investing in tools which feature extensions for processing Microsoft Excel content.
We would highly recommend Kutools for Excel. This efficient add-on features more than 300 advanced functions. It provides the perfect safety net when a TMS struggles to unravel the complexities of a spreadsheet.
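For the case mentioned above of formulas with translatable text portions, a small script can at least list the text that is hidden inside a formula. The sketch below assumes that the raw cell content is available as a string beginning with an equals sign, and the function name translatable_parts is hypothetical. It collects every double-quoted literal, so format strings such as "0.00" would also be returned and need manual review.

```python
import re

# Excel string literals are enclosed in double quotes; "" is an escaped quote.
STRING_LITERAL = re.compile(r'"((?:[^"]|"")*)"')


def translatable_parts(cell_value):
    """Return the text literals found in a formula, or [] for non-formula cells."""
    if not isinstance(cell_value, str) or not cell_value.startswith("="):
        return []
    literals = STRING_LITERAL.findall(cell_value)
    # Undo the doubled-quote escape and skip whitespace-only literals.
    return [lit.replace('""', '"') for lit in literals if lit.strip()]


print(translatable_parts('="Total: "&B2&" items"'))  # ['Total: ', ' items']
print(translatable_parts("=SUM(A1:A3)"))             # []
```

The extracted fragments still have to be translated and written back into the formula, which is where a dedicated add-on or manual work comes in.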
Conclusion
Microsoft Excel is a highly versatile application. It was designed as an accounting tool but is now utilised for a variety of purposes including the holding of source data for translation. The advanced functionality of Excel is impressive.
However, when those entering data for translation take advantage of that functionality, the resulting content can be problematic for translation technology.
The latest systems are equipped to deal with many of the potential issues that could arise, but certain scenarios remain beyond the capabilities of a TMS. Such scenarios could impact the quality of translations. Manually amending outputs to enhance their quality will reduce the profitability of any project.
By investing in knowledge and technology, it is possible to overcome the challenges presented by Excel when it is used as a file format to hold content for translation.