
Welcome to the Lounge

   

Re: There are many gotos, but these ones are mine
honey the codewitch, 12-May-24 8:28
trønderen wrote:
If you have any reason at all to relate to the generated code. Trading readability and maintainability for "slightly faster code" is generally a bad move.
I generally agree with you. However, as we both know, there are exceptions, which is why you used the word "generally", I'm sure. This is one of those cases, as lexing is always in a critical code path, and a generalized lexer must be able to handle bulk input as efficiently as possible.


My input to Visual FA is one or more regular expressions. Literally just that. Here is the full input for that generated lexer, in my .rl Rolex lexer format, but the grammar below should be easy enough to discern without knowing the format.

Object = "{"
ObjectEnd = "}"
Array = "["
ArrayEnd = "]"
FieldSeparator = ":"
Comma = ","
Number = '-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?'
Boolean = 'true|false'
Null = "null"
String = '"([^\n"\\]|\\([btrnf"\\/]|(u[0-9A-Fa-f]{4})))*"'
WhiteSpace = '[ \t\r\n]+'
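
To get a feel for what these token definitions accept, the Number and String expressions can be tried against .NET's own `System.Text.RegularExpressions.Regex`. This is only a sanity-check sketch: Visual FA has its own regex engine, and the assumption here is that these two patterns mean the same thing in both dialects. The `TokenPatterns` class and its `IsWhole` helper are hypothetical names used for illustration.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Visual FA or Rolex.
public static class TokenPatterns
{
    // Copied from the Number and String rules in the grammar above.
    public const string JsonNumber =
        @"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?";
    public const string JsonString =
        @"""([^\n""\\]|\\([btrnf""\\/]|(u[0-9A-Fa-f]{4})))*""";

    // True when the pattern matches the entire input.
    // \A and \z anchor at the absolute start and end of the text.
    public static bool IsWhole(string pattern, string input)
    {
        return Regex.IsMatch(input, @"\A(?:" + pattern + @")\z");
    }
}
```

With these definitions, `-12.5e+3` is accepted as a Number, while `012` (leading zero) and `1.` (no digits after the point) are rejected. A String may contain `\n` or `\u00e9` escapes, but an unknown escape such as `\x` is rejected.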


The table-driven code runs on a flat array of integers. It might be more efficient to unflatten it in this case, maybe? I used to run a more complicated array of structs for this, and I don't remember there being a performance difference. Anyway, there is also an array of int arrays for a feature called block ends, which simulates lazy matching on a DFA. (I have the details of all of it documented in my Visual FA series.) It's also simpler in operation than it looks.

I do actually use gotos in a couple of places here to restart the state machine. It was much less complicated than orchestrating a while with breaks.

I should state that I didn't comment the code here because it wouldn't help me. It may help others, but I didn't really care about that. This pattern is burned into my brain after writing more than half a dozen lexers that follow the same pattern. Comments would honestly just clutter it for me, as the code makes immediate sense to me despite how it looks, and I didn't write it for a team.
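
Block ends are only named in passing above. As a rough illustration of the idea, assuming a block such as a `/* ... */` comment, the token is "opener, then everything up to the first terminator". The sketch below uses `string.IndexOf` in place of a second DFA, so it is a swapped-in technique and not how Visual FA does it. `BlockEndSketch` and `BlockLength` are hypothetical names.

```csharp
using System;

// Hypothetical illustration of the block-end idea, not Visual FA code.
public static class BlockEndSketch
{
    // Returns the length of the block token beginning at 'start', or -1
    // when the opener is absent or the terminator never appears.
    public static int BlockLength(string s, int start, string open, string close)
    {
        // The opener must be present at 'start' for the block to begin.
        if (string.CompareOrdinal(s, start, open, 0, open.Length) != 0)
        {
            return -1;
        }
        // Stop at the FIRST terminator after the opener. That is what a
        // lazy match would do, and what a greedy DFA cannot express alone.
        int end = s.IndexOf(close, start + open.Length, StringComparison.Ordinal);
        return end < 0 ? -1 : end + close.Length - start;
    }
}
```

For `/* a * b */ x` this yields 11, the length of the comment including both delimiters, and it yields -1 for an unterminated `/* oops`.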

C#
private FAMatch _NextImpl(
#if FALIB_SPANS
	ReadOnlySpan<char> s
#else
	string s
#endif
	)
{
	int tlen;
	int tto;
	int prlen;
	int pmin;
	int pmax;
	int i;
	int j;
	int state = 0;
	int acc;
	if (position == -1)
	{
		// first read
		++position;
	}
	int len = 0;
	long cursor_pos = position;
	int line = this.line;
	int column = this.column;
	int ch = -1;
	Advance(s, ref ch, ref len, true);
	start_dfa:
	acc = _dfa[state];
	++state;
	tlen = _dfa[state];
	++state;
	for (i = 0; i < tlen; ++i)
	{
		tto = _dfa[state];
		++state;
		prlen = _dfa[state];
		++state;
		for (j = 0; j < prlen; ++j)
		{
			pmin = _dfa[state];
			++state;
			pmax = _dfa[state];
			++state;
			if (ch < pmin)
			{
				state += ((prlen - (j + 1)) * 2);
				j = prlen;
			}
			else if (ch <= pmax)
			{
				Advance(s, ref ch, ref len, false);
				state = tto;
				goto start_dfa;
			}
		}
	}
	if (acc != -1)
	{
		int sym = acc;
		int[] be = (_blockEnds != null && _blockEnds.Length > acc) ? _blockEnds[acc] : null;
		if (be != null)
		{
			state = 0;
			start_be_dfa:
			acc = be[state];
			++state;
			tlen = be[state];
			++state;
			for (i = 0; i < tlen; ++i)
			{
				tto = be[state];
				++state;
				prlen = be[state];
				++state;
				for (j = 0; j < prlen; ++j)
				{
					pmin = be[state];
					++state;
					pmax = be[state];
					++state;
					if (ch < pmin)
					{
						state += ((prlen - (j + 1)) * 2);
						j = prlen;
					}
					else if (ch <= pmax)
					{
						Advance(s, ref ch, ref len, false);
						state = tto;
						goto start_be_dfa;
					}
				}
			}
			if (acc != -1)
			{
				return FAMatch.Create(sym,
#if FALIB_SPANS
					s.Slice(unchecked((int)cursor_pos), len).ToString()
#else
					s.Substring(unchecked((int)cursor_pos), len)
#endif
					, cursor_pos, line, column);
			}
			if (ch == -1)
			{
				return FAMatch.Create(-1,
#if FALIB_SPANS
					s.Slice(unchecked((int)cursor_pos), len).ToString()
#else
					s.Substring(unchecked((int)cursor_pos), len)
#endif
					, cursor_pos, line, column);
			}
			Advance(s, ref ch, ref len, false);
			state = 0;
			goto start_be_dfa;
		}
		return FAMatch.Create(acc,
#if FALIB_SPANS
					s.Slice(unchecked((int)cursor_pos), len).ToString()
#else
					s.Substring(unchecked((int)cursor_pos), len)
#endif
			, cursor_pos, line, column);
	}
	// error. keep trying until we find a potential transition.
	while (ch != -1)
	{
		var moved = false;
		state = 1;
		tlen = _dfa[state];
		++state;
		for (i = 0; i < tlen; ++i)
		{
			++state;
			prlen = _dfa[state];
			++state;
			for (j = 0; j < prlen; ++j)
			{
				pmin = _dfa[state];
				++state;
				pmax = _dfa[state];
				++state;
				if (ch < pmin)
				{
					state += ((prlen - (j + 1)) * 2);
					j = prlen;
				}
				else if (ch <= pmax)
				{
					moved = true;
				}
			}
		}
		if (moved)
		{
			break;
		}
		Advance(s, ref ch, ref len, false);
	}
	if (len == 0)
	{
		return FAMatch.Create(-2, null, 0, 0, 0);
	}
	return FAMatch.Create(-1,
#if FALIB_SPANS
					s.Slice(unchecked((int)cursor_pos), len).ToString()
#else
					s.Substring(unchecked((int)cursor_pos), len)
#endif
		, cursor_pos, line, column);
}
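
For readers who have not seen the pattern before, here is a stripped-down sketch of the same walk. The table layout is inferred from how `_NextImpl` reads `_dfa`, not taken from the Visual FA documentation. Each state is an accept id (-1 for none) and a transition count, and each transition is a target index, a range count, and that many (min, max) pairs. The `Digits` table is hand-built for the single rule `[0-9]+`. `FlatDfaSketch`, `MatchWithGoto` and `MatchWithLoop` are hypothetical names. The second method is the "while with breaks" alternative mentioned above, included only for comparison.

```csharp
// Hypothetical sketch of a flat-array DFA walk, not Visual FA code.
public static class FlatDfaSketch
{
    // Hand-built table for the rule [0-9]+ with symbol id 0.
    public static readonly int[] Digits =
    {
        -1, 1,  6, 1, '0', '9',  // index 0: not accepting, digits go to index 6
         0, 1,  6, 1, '0', '9',  // index 6: accepts symbol 0, loops on digits
    };

    // Returns the accept id of the last state reached (-1 if none);
    // 'matched' receives the number of characters consumed.
    public static int MatchWithGoto(int[] dfa, string s, out int matched)
    {
        int pos = 0, state = 0, ch, acc, tlen;
    next_state:
        ch = pos < s.Length ? s[pos] : -1;  // -1 marks end of input
        acc = dfa[state++];
        tlen = dfa[state++];
        for (int i = 0; i < tlen; ++i)
        {
            int tto = dfa[state++];
            int prlen = dfa[state++];
            for (int j = 0; j < prlen; ++j)
            {
                int pmin = dfa[state++];
                int pmax = dfa[state++];
                if (ch < pmin)
                {
                    // Assumes ranges are sorted ascending, as the skip in
                    // the original implies: jump past the remaining pairs.
                    state += (prlen - (j + 1)) * 2;
                    break;
                }
                if (ch <= pmax)
                {
                    ++pos;
                    state = tto;
                    goto next_state;  // restart the machine at the target
                }
            }
        }
        matched = pos;
        return acc;
    }

    // The same walk written with a flag and breaks instead of goto.
    public static int MatchWithLoop(int[] dfa, string s, out int matched)
    {
        int pos = 0, state = 0, acc;
        bool moved;
        do
        {
            moved = false;
            int ch = pos < s.Length ? s[pos] : -1;
            acc = dfa[state++];
            int tlen = dfa[state++];
            for (int i = 0; i < tlen && !moved; ++i)
            {
                int tto = dfa[state++];
                int prlen = dfa[state++];
                for (int j = 0; j < prlen; ++j)
                {
                    int pmin = dfa[state++];
                    int pmax = dfa[state++];
                    if (ch < pmin)
                    {
                        state += (prlen - (j + 1)) * 2;
                        break;
                    }
                    if (ch <= pmax)
                    {
                        ++pos;
                        state = tto;
                        moved = true;  // leave both for loops, then iterate again
                        break;
                    }
                }
            }
        } while (moved);
        matched = pos;
        return acc;
    }
}
```

Both methods return symbol 0 with 3 characters consumed for the input `123abc`, and -1 with 0 consumed for `abc`. The loop version needs an extra flag, an extra loop condition and two breaks to do what a single `goto` does in the first.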

Check out my IoT graphics library here:
https://honeythecodewitch.com/gfx
And my IoT UI/User Experience library here:
https://honeythecodewitch.com/uix
