Convert Japanese string to Romaji

Santhosh G_, 13 Jun 2010, CPOL
Convert Hiragana and Katakana string to Romaji

Introduction

This simple tool converts a Japanese string to the equivalent Romaji string.

Screenshot of Japanese to Romaji converter.

Background

The romanization of Japanese is the use of the Latin alphabet to write the Japanese language. This method of writing is known as Romaji. The Japanese language is written with a combination of three scripts: Chinese characters called Kanji, and two syllabic scripts made up of modified Chinese characters, Hiragana and Katakana. This tool converts the Hiragana and Katakana letters to Romaji; Kanji is not converted.

Using the Code

The Unicode charts guided me in building this tool.

When I started studying Japanese, I found it very difficult to read Japanese text, and I thought a tool that converts Japanese to Romaji would help. After reading about the Unicode ranges of the Japanese characters, I got the idea of creating the tool myself. Many tools for converting Japanese to Romaji are already available on the internet; this article shows one simple technique for doing the conversion.

// This function parses the given file data and fills pUnicodeMap.
// Returns the count of parsed elements.
int Parser::ParaseDB( TCHAR* pDBFileData_i, std::map<int,std::wstring>& pUnicodeMap );

The language information is added as two resource files, Hiragana.txt and Katakana.txt. These files are read from the application's resources using the following code:

// Locate, load and lock the Hiragana table stored as an "IDR_DATA" resource.
HRSRC hrInfo = FindResource( 0, MAKEINTRESOURCE( IDR_IDR_HIRAGANA ), TEXT("IDR_DATA") );
HGLOBAL hRc = LoadResource( 0, hrInfo );
TCHAR* pBuffer = (TCHAR*)LockResource( hRc );

Parser p;
// Parse the database of Hiragana.txt and update the Unicode map.
p.ParaseDB( pBuffer, m_UnicodeMap );

// Repeat the same steps for the Katakana table.
HRSRC hrInfoKatakana = FindResource( 0, MAKEINTRESOURCE( IDR_IDR_KATAKANA ), TEXT("IDR_DATA") );
HGLOBAL hRcKatakana = LoadResource( 0, hrInfoKatakana );
TCHAR* pBufferKatakana = (TCHAR*)LockResource( hRcKatakana );

// Parse the database of Katakana.txt and update the Unicode map.
p.ParaseDB( pBufferKatakana, m_UnicodeMap );
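
Once both tables are in the map, converting a string can be reduced to one lookup per character. The article does not show that step, so the following is only a minimal sketch of how it might look. The names IsKana and ToRomaji are hypothetical and are not taken from the article's source; the sketch assumes the map is keyed by Unicode code point, as the ParaseDB() signature suggests.

```cpp
#include <map>
#include <string>

// Hiragana occupies U+3040..U+309F and Katakana U+30A0..U+30FF in Unicode.
bool IsKana(wchar_t ch)
{
    return ch >= 0x3040 && ch <= 0x30FF;
}

// Hypothetical helper (not from the article's source): replaces every
// character found in unicodeMap with its Romaji string. Kana that are
// missing from the map become L'?'; all other characters are copied as-is.
std::wstring ToRomaji(const std::wstring& text,
                      const std::map<int, std::wstring>& unicodeMap)
{
    std::wstring romaji;
    for (wchar_t ch : text) {
        const auto entry = unicodeMap.find(static_cast<int>(ch));
        if (entry != unicodeMap.end()) {
            romaji += entry->second;   // known character: use the table entry
        } else if (IsKana(ch)) {
            romaji += L'?';            // kana without a table entry
        } else {
            romaji += ch;              // punctuation, Latin text, Kanji, etc.
        }
    }
    return romaji;
}
```

A per-character lookup like this cannot handle combinations whose reading depends on the neighbouring character, such as the small ゃ, ゅ, ょ or the doubling mark っ; a fuller converter would need to look at pairs of characters.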

Points of Interest

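The Parser class can parse a text file containing Unicode values and the corresponding romanized strings. The ParaseDB() function parses data in the format below and outputs a map containing all the elements in the input file. In the sample entry, 3093 is the hexadecimal code point of the Hiragana letter ん (U+3093) and 12435 is the same value in decimal, followed by the romanized string.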

BEGIN_MAP:
//ん
3093,12435,N.
      ..
      ..
END_MAP:
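
The article does not list the body of ParaseDB(), so the following is only a sketch of how a parser for this format could be written in portable C++. ParseMapText is a hypothetical name, and the sketch makes several assumptions that the source does not confirm: entries sit between the BEGIN_MAP: and END_MAP: lines, lines starting with // are comments, the first field is the hexadecimal code point, and the third field is stored verbatim (including any trailing period).

```cpp
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for Parser::ParaseDB, not the article's implementation.
// Fills unicodeMap from text in the BEGIN_MAP:/END_MAP: format and returns
// the number of entries parsed.
int ParseMapText(const std::wstring& data, std::map<int, std::wstring>& unicodeMap)
{
    const auto npos = std::wstring::npos;
    const wchar_t* whitespace = L" \t\r\n";
    std::wistringstream in(data);
    std::wstring line;
    bool inMap = false;
    int count = 0;

    while (std::getline(in, line)) {
        // Trim surrounding whitespace, including the '\r' of CRLF files.
        const auto first = line.find_first_not_of(whitespace);
        if (first == npos) continue;                      // blank line
        line = line.substr(first, line.find_last_not_of(whitespace) - first + 1);

        if (line == L"BEGIN_MAP:") { inMap = true; continue; }
        if (line == L"END_MAP:") break;
        if (!inMap || line.compare(0, 2, L"//") == 0) continue;  // outside map or comment

        // Expect three comma-separated fields: hex, decimal, romaji.
        const auto comma1 = line.find(L',');
        const auto comma2 = (comma1 == npos) ? npos : line.find(L',', comma1 + 1);
        if (comma2 == npos) continue;                     // malformed line: skip

        try {
            const int codePoint = std::stoi(line.substr(0, comma1), nullptr, 16);
            unicodeMap[codePoint] = line.substr(comma2 + 1);
            ++count;
        } catch (const std::exception&) {
            // First field was not a valid hexadecimal number: skip the line.
        }
    }
    return count;
}
```

Keying the map on the first field and ignoring the decimal copy is a design choice of this sketch; since both fields encode the same value, either could be used.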

History

  • 13th June, 2010: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Santhosh G_
India

Article Copyright 2010 by Santhosh G_