I'm working on rebuilding a site (using CodeIgniter incidentally) and I have noticed that we have a pretty sizeable collection of language strings such as user prompts, page descriptions, form labels, etc. Our current language string organization is not ideal for a few reasons:
1) We have our language strings spread out over some 45 files in 14 subdirectories for English alone. Multiply that by the number of languages and you can get a feel for the file management burden that could develop. The files are roughly organized to associate one language file with each major page or controller in our MVC scheme.
2) The format of each language file is pretty simple PHP, but it could be more efficiently stored. Here's an example:
$lang['login_meta_title'] = 'Login';
$lang['login_meta_keywords'] = 'Login';
$lang['login_meta_description'] = 'Login';
$lang['login_heading'] = 'Login';
$lang['login_username'] = 'Username';
$lang['login_password'] = 'Password';
$lang['login_login_now'] = 'Login';
$lang['login_forgot_username'] = 'Forgot username?';
$lang['login_forgot_password'] = 'Forgot password?';
$lang['login_not_a_registered_user'] = 'Not a registered user?';
$lang['login_register_now'] = 'Register now';
Seems to me some INI file format would get rid of all the $ and quotes and semicolons.
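To make the INI idea concrete, here is a minimal sketch that reads a few of the strings above with PHP's built-in parse_ini_string(). The [login] section name and the flattening back to "login_..." keys are my own assumptions, not something we have today. One caveat: PHP's INI parser treats characters such as ? and ! as special in unquoted values, so the quotes don't entirely go away.

```php
<?php
// Hypothetical INI version of the login strings; the [login] section name is an assumption.
$ini = <<<'INI'
[login]
meta_title = "Login"
heading = "Login"
username = "Username"
password = "Password"
forgot_username = "Forgot username?"
register_now = "Register now"
INI;

// Passing true as the second argument keeps each section as a nested array.
$parsed = parse_ini_string($ini, true);

// Flatten back to the existing "login_username"-style keys.
$lang = [];
foreach ($parsed as $section => $strings) {
    foreach ($strings as $key => $value) {
        $lang[$section . '_' . $key] = $value;
    }
}

echo $lang['login_forgot_username'], "\n";
```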
3) When you are working in a controller, you must specify which language file(s) to load to define the strings you expect to use. Remembering which file contains which prompts is likely to be a source of errors at some point.
4) This method of organizing language strings can introduce redundancy. For example, the word 'Login' appears five times above, and the same duplication occurs whenever two distinct controllers or pages use the same word.
5) Between the redundancy and file management concerns, translation promises to have some complicated aspects to it.
6) Constructing some interface to assist with language translation would be tricky because:
a) we'd need to concoct something to traverse our directory structures and make sure each en file had a match in the es, de, or fr folders
b) writing the values back to the storage format (a PHP file that defines an array) is a bit awkward as one must output PHP rather than CSV or JSON/whatever.
c) language strings are identified by path/file/array key.
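For what it's worth, points (a) and (b) can be roughed out with built-in functions. This is only a sketch with made-up sample data: array_diff_key() finds keys missing from a translation, and var_export() produces correctly escaped PHP string literals for writing a file back out. The render_lang_file() helper name is hypothetical.

```php
<?php
// Hypothetical sample data; real arrays would come from including the en/ and es/ files.
$en = ['login_heading' => 'Login', 'login_username' => 'Username', 'login_password' => 'Password'];
$es = ['login_heading' => 'Iniciar sesión', 'login_username' => 'Usuario'];

// Keys present in English but missing from Spanish.
$missing = array_keys(array_diff_key($en, $es));

// Render an array back into the existing $lang['key'] = 'value'; format.
function render_lang_file(array $strings): string
{
    $out = "<?php\n";
    foreach ($strings as $key => $value) {
        // var_export() returns a quoted, escaped PHP literal when its second argument is true.
        $out .= '$lang[' . var_export($key, true) . '] = ' . var_export($value, true) . ";\n";
    }
    return $out;
}

print_r($missing);
echo render_lang_file($es);
```

It works, but it still means generating PHP source and walking every file pair, which is the awkwardness I'm describing.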
Despite these possible issues, the usage in a controller is pretty easy, just load some language strings like so:
$this->lang->load('search/search_products', $this->language); // first param is the path to a language file, the second is the current language, e.g., "en" for English.
$description = $this->lang->line('search_results_meta_description');
What are the advantages of this method?
1) We don't have a single massive language file that must be evaluated for each page request, possibly causing performance issues
2) path/file/key storage allows re-use of simple key names without name collisions
3) I dunno....smaller files easy to comprehend?
Sooooooo it occurred to me that a language registry (and possibly other types of registry) that relies on Memcache might be a superior alternative for these reasons:
A) Storage in RAM should mean performance that is orders of magnitude faster than parsing some disk file
B) A single registry object should allow storage of all the strings in one file, which is certainly much simpler without performance penalty of parsing one huge massive file for every page access
C) We could use JSON or INI file storage for easy export to permanent storage and/or comparison with other language translations without having to parse some weird directory structure.
I'm thinking this would work something like this.
If a user requests a language string in a particular language and the registry hasn't been instantiated, we parse the corresponding language file and use Memcache to store the name-value pairs under the appropriate language.
If a user requests a language string and the corresponding registry exists, we simply return it using Memcache.
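Here is a rough sketch of that lookup flow, just to make it concrete. Everything named here (StringCache, ArrayCache, LanguageRegistry, the loader callback) is hypothetical. To keep it runnable without a Memcache server, the cache is an in-process array stand-in; a real adapter would presumably wrap Memcached::get() and Memcached::set(), which also signal a miss by returning false.

```php
<?php
// Minimal cache contract so the sketch runs without a Memcache server (hypothetical name).
interface StringCache
{
    public function get(string $key);               // returns false on a miss
    public function set(string $key, $value): void;
}

// In-process stand-in; a real adapter would delegate to Memcached::get() and Memcached::set().
class ArrayCache implements StringCache
{
    private $data = [];
    public function get(string $key) { return $this->data[$key] ?? false; }
    public function set(string $key, $value): void { $this->data[$key] = $value; }
}

class LanguageRegistry
{
    private $cache;
    private $loader;   // callable: language code => array of strings
    public $loads = 0; // counts how often the language file had to be parsed

    public function __construct(StringCache $cache, callable $loader)
    {
        $this->cache = $cache;
        $this->loader = $loader;
    }

    public function line(string $lang, string $key)
    {
        $cacheKey = 'lang_' . $lang;
        $strings = $this->cache->get($cacheKey);
        if ($strings === false) {
            // Cache miss: parse the language file once and store the whole set.
            $strings = ($this->loader)($lang);
            $this->cache->set($cacheKey, $strings);
            $this->loads++;
        }
        return $strings[$key] ?? null;
    }
}

// Hypothetical loader; in practice this might parse one JSON or INI file per language.
$loader = function (string $lang): array {
    return ['login_heading' => 'Login', 'login_username' => 'Username'];
};

$registry = new LanguageRegistry(new ArrayCache(), $loader);
echo $registry->line('en', 'login_username'), "\n";
echo $registry->line('en', 'login_heading'), "\n";
```

Storing the whole language under one key means one cache fetch per request rather than one per string, but memcached caps item size (1 MB by default), so a very large string set might need to be split into groups.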
Drawbacks?
* If the number of language strings is quite large, it might take a moment to parse a big language file. I don't expect this to be a significant problem and, if it is slow, it would only need to happen on initialization or when the cache has expired (which could be quite a long time).
* Refreshing the cache on the live server when we want to update the language files might require some creativity.
* Is Memcache reliable and stable? Will strings be vulnerable to corruption if they reside in memory a long time?
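On the refresh question, one idea I've been toying with is to bake the language file's modification time into the cache key, so that editing the file automatically produces a new key and the stale entry simply ages out. This is only a sketch; the lang_cache_key() helper and the temp-file demo are made up for illustration.

```php
<?php
// Build a cache key that changes whenever the language file changes (hypothetical helper).
function lang_cache_key(string $lang, string $file): string
{
    // filemtime() returns the last-modified timestamp, so an edited file yields a new key.
    return 'lang_' . $lang . '_' . filemtime($file);
}

// Demo with a throwaway file standing in for a real language file.
$file = tempnam(sys_get_temp_dir(), 'lang');
file_put_contents($file, '{"login_heading":"Login"}');

touch($file, 1000);           // pretend the file was last edited at timestamp 1000
clearstatcache();             // PHP caches stat results within a request
$before = lang_cache_key('en', $file);

touch($file, 2000);           // simulate deploying an updated file
clearstatcache();
$after = lang_cache_key('en', $file);

var_dump($before !== $after); // the key differs once the file has been modified
unlink($file);
```

The catch is that this costs a stat on the disk file for every request, which gives back a little of the benefit. Bumping a version number in a config file at deploy time would avoid that.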
Anyone have any thoughts?