I personally know of only 1 C++ interpreter, which is used by Root (CINT, developed at CERN). It made me wonder why there are so few, and the obvious answer is that C++ is a very complex language to parse. No one in their right mind would bother writing a C++ parser from scratch...

However, there are C++ parsers out there as part of compilers, some of which are open source. It occurred to me that it might be possible to interpret the assembly code that is emitted by the parser on the go. This would be like a JIT compiler that emits bytecode to be interpreted by a virtual machine. One problem that comes to mind is the use of multiple source files, which would otherwise lead to several object files that need to be linked. This is indeed a limitation, but not a very serious one if you ask me.
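
A minimal sketch of what "bytecode interpreted by a virtual machine" means, assuming a made-up stack-machine instruction set. The opcode names (PUSH, ADD, MUL, HALT) and the run function are illustrative only and are not taken from CINT or any real compiler back end. Python is used just to keep the sketch short:

```python
def run(program):
    """Execute a list of (opcode, operand) pairs on a tiny stack machine."""
    stack = []
    pc = 0  # index of the next instruction to execute
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif op == "HALT":
            break
        else:
            raise ValueError("unknown opcode: %r" % (op,))
    # The value left on top of the stack is the result, if there is one.
    return stack[-1] if stack else None


# Hypothetical instructions a front end might emit for: (2 + 3) * 4
program = [
    ("PUSH", 2),
    ("PUSH", 3),
    ("ADD", None),
    ("PUSH", 4),
    ("MUL", None),
    ("HALT", None),
]
result = run(program)
print(result)
```

The loop itself is the easy part. A real front end would have to emit something like `program` for every construct in the language, which is where most of the work would lie.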

Still, it appears that no one has done this, which normally means that it isn't as easy as it seems (not that my idea would be trivial, but still). I was wondering what the problems with my idea would be... Anyone care to comment?
Updated 15-Jul-14 7:33am
Comments
[no name] 15-Jul-14 13:36pm    
Does C++/CLI count? The compiler / linker gives you CLR bytecode assemblies.
Sergey Alexandrovich Kryukov 15-Jul-14 13:49pm    
Good idea, but the "classic" C++ part is not really suitable for interpretation. Until OP's message, I was almost sure that interpreting "classic" C++ would be simply impossible. The fact is: CINT is not a C++ interpreter. (!)
—SA
Sergey Alexandrovich Kryukov 15-Jul-14 14:02pm    
Anyway, I provided some answer, please see Solution 1.
—SA
enhzflep 15-Jul-14 14:02pm    
For what purpose would somebody make such a thing?
What advantage would this give over simply compiling the code?
Sergey Alexandrovich Kryukov 15-Jul-14 14:04pm    
Well, that is something I could imagine: the same purposes other interpreters have been created for. It's just that some users are imprinted on C or C++... :-)
Anyway, please see my answer.
—SA

Please see comments to the question, by bling (good idea about C++/CLI), and myself.

The fact is: CINT is not a C++ interpreter. This is clearly stated here:
http://en.wikipedia.org/wiki/CINT[^].

The language it accepts is a simplified language based on C and C++.

See also: http://root.cern.ch/drupal/content/cling[^].

How about one alternative mentioned in the Wikipedia article referenced above, "Ch"? It is not exactly a C++ interpreter either; it interprets its own special language:
http://en.wikipedia.org/wiki/Ch_%28computer_programming%29[^],
http://www.softintegration.com/[^].

Anything else? "Pike", also only based on C and C++ (Wikipedia even says "influenced by"):
http://en.wikipedia.org/wiki/Pike_%28programming_language%29[^],
http://pike.lysator.liu.se/[^].

As to the "real" C++ language, I believe it is not suitable for an interpreter; the topic is just too complex for me to lay out my considerations in Quick Questions & Answers. I cannot prove it right here and am not 100% sure, but I tend to think it can be proven. I would be quite surprised if someone proved me wrong.

—SA
 
Comments
Joren Heit 15-Jul-14 14:08pm    
I'm aware that CINT started as a C interpreter (hence the name), and that over time more and more C++ features have been added. By now, it supports templates, classes, inheritance, and much of the stuff that makes C++ C++. The fact that the language has been simplified to some extent does not mean it's not possible.

I must say I'm not at all familiar with C++/CLI (not a fan of platform specific languages; yes I know about Mono but let's be honest...), but in what way would the bytecode to be interpreted by the CLR be different from assembly code interpreted by a virtual x86 machine?
Sergey Alexandrovich Kryukov 15-Jul-14 16:57pm    
Not many understand that C++ is also a platform-specific language; probably you don't, either. It's just that when C++ was created, all platforms were of the same type. C++ looks universal, but its features were dictated by the architecture of the platforms of that time. Stroustrup failed to realize that, too. In CLI, the whole notion of the language is very different (and stricter, by the way); syntax is strictly separate from platform-specific semantics, and in C++ it is not. For example, operator "new" assumes quite specific platform features. Anyway, C++/CLI might not be applicable to your goals, but just know that pure .NET interpreters are quite possible and do exist. One example is PowerScript.

"It's not possible"? What "it"? You have no evidence of a C++ interpreter. I am just saying that if the language is simplified, it simply cannot be called C++. So, from what I know, there is not a single C++ interpreter, and I suspect they are impossible. Now, if what you want is not C++ but something "similar to C++", and such a pseudo-C++ language is fine by you (it looks like it is), that's just fine, use one. That's why I referenced some alternatives.

—SA
Andreas Gieriet 15-Jul-14 15:47pm    
Hello Sergey,
if you allow the term "interpreter" to be used also for some intermediate (abstract assembly) language that is interpreted by some runtime environment, then I would conclude that any compiled language can be "interpreted". The tricky part is that you must provide the whole translation unit to the interpreter, not just one line at a time. So it would mean compiling into an intermediate form and interpreting that. But then you could just as well compile directly into native code and call the executable ;-)
Cheers
Andi
Sergey Alexandrovich Kryukov 15-Jul-14 16:59pm    
Well, it depends on how you define interpretation. Maybe in some sense you are right, but frankly, I did not get your point. Compilation into an intermediate form followed by interpretation of that form is not something I would call an "interpreter". You reduce the problem of interpretation to compilation, in a rather trivial fashion...
—SA
Andreas Gieriet 15-Jul-14 17:37pm    
I follow the interpretation of Wikipedia: Interpreter (computing). The footnote even says: [...] In this sense, the CPU is also an interpreter, of machine instructions [...] which is a bit of a stretch, I'd say ;-) According to that article, (immediate) interpretation of byte code or some AST can be regarded as interpretation. C++ can certainly be translated into some AST but, as said in my first comment, the parser needs to see the whole translation unit, since e.g. parsing classes and performing the name lookup needed to decide how to continue parsing takes at least two passes. If you allow at least two passes in parsing to get to some intermediate form (byte code/AST/...), I guess C++ can be transformed into that intermediate form, which is then run by the interpreter.
I would be interested in which use cases this would be useful for, though ;-)
Cheers
Andi
Quote:
I personally know of only 1 C++ interpreter, which is used by Root (CINT, developed at CERN). It made me wonder why there are so few, and the obvious answer is that C++ is a very complex language to parse. No one in their right mind would bother writing a C++ parser from scratch...

There are some important questions here: What is your goal and why would a C/C++ interpreter be the best (or at least a reasonably good) solution to that problem? "What is your goal?" and "Why is X a good solution?" are generally very good first questions to ask before wasting valuable time on implementing a solution.

C/C++ is impractical as a console based interpreted language for many reasons and people usually don't like wasting time on implementing tools that are not practical (well, maybe those who have a lot of time to have fun and learn and/or those who can't foresee the impractical aspects).

Implementing an interpreter for a dynamic language (like python) with dynamically typed variables and "dynamically linked functions" is orders of magnitude easier and less error prone than doing the same with C/C++. Writing the whole parser, compiler, interpreter for a simple dynamic language is quite easy and quick. The interpreter itself doesn't have to be very performant because it's usually used only to execute high level logic, glue code. If something is performance critical then you can implement it as a native C/C++ module for your interpreter and bind it to the dynamically interpreted language. In my opinion you haven't really thought over the problems that would arise with C/C++ interpreters. I could write a novel about them. Here are the most interesting problems that are big enough:

C/C++ requires forward declarations. In a practical interpreted language there are no forward declarations and no separation between declarations and definitions. But this is just a tiny part of a more serious problem: C/C++ is impractically verbose to be used as an interpreted language. This is a set of problems that involves both language design and practical usability as an interpreted language.

In python I can easily define an X function that calls a nonexistent Y function, and I am allowed to define this Y function later. Then I can execute X. Later, if I want, I can completely change the definition of Y (maybe with some new default-initialized parameters to be backward compatible with the previous definition) and the next execution of X immediately uses the new Y definition. I can execute X even if it calls a Z function that hasn't been defined yet, and this isn't a problem as long as the execution of X with the actual input parameters doesn't run into the branch that would call Z.

In a C++ interpreter, what happens if I change the definition of a very basic template (e.g. a dynamic array) that is used by a lot of previously defined functions (for example as inlined functions)? How do you find out which code blocks need recompilation inside your interpreter in order to use the new inline functions? What happens if the recompilation of some of these functions fails with the new definition? You don't have serious problems like this with dynamic languages like python. Dealing with such scenarios in a non-dynamic interpreter would be a huge problem, and these were just two "simple/basic" problems; this is only the tip of the iceberg. I've been programming C/C++ for more than a decade and never felt the need for a C/C++ interpreter.
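
The Python behaviour described above can be seen in a few lines. This is only a minimal sketch; the function names x, y and z mirror the X, Y and Z in the text and are otherwise arbitrary:

```python
def x(n):
    # y and z are looked up by name only when the line that uses them runs.
    if n < 0:
        return z(n)  # z is never defined; harmless unless this branch executes
    return y(n) + 1


def y(n):
    # First definition of y, written after x already refers to it.
    return n * 2


first = x(3)  # uses the first y


def y(n, scale=10):
    # Redefinition with an extra defaulted parameter, so old call sites still work.
    return n * scale


second = x(3)  # the same x now picks up the new y

print(first, second)
```

Nothing about x had to be recompiled or relinked when y changed, because the name is resolved at call time. A C++ interpreter would have to add this kind of indirection itself, and for inlined templates that is exactly the hard part.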

Quote:
However, there are C++ parsers out there as part of compilers, some of which are open source. It occurred to me that it might be possible to interpret the assembly code that is emitted by the parser on the go. This would be like a JIT compiler that emits bytecode to be interpreted by a virtual machine. One problem that comes to mind is the use of multiple source files, which would otherwise lead to several object files that need to be linked. This is indeed a limitation, but not a very serious one if you ask me.

I started to write an answer, but it became too long and complex as it analyzed your problem along many different cross-sections. Instead, I'll give just some of my conclusions here.

You should answer the questions I posted in the previous paragraph. From your question, I think maybe your goal is simply to create a C++ interpreter with the least amount of work invested; I assume this because you want to reuse stuff (compilation units, object files) from existing compilers. My problem with this is that people do something with "least work invested" when they have to do something they don't like. On the other hand, if they do something just for fun as a hobby or with a specific purpose, they usually don't mind doing a lot of work. In the case of a C/C++ interpreter, all solutions seem to involve a lot of work.

Depending on the compiler and its settings, the object files might contain something other than what you need. They are not standardized, and with some settings (like LTCG, link-time code generation) even the same compiler can put data into them that is completely useless for this purpose. IMO it would be more practical to write a backend that emits what you need instead of trying to parse it out of object files.
 
Comments
Sergey Alexandrovich Kryukov 15-Jul-14 23:18pm    
Very good points, a 5.
—SA
pasztorpisti 16-Jul-14 0:04am    
Thank you! "Every problem can be solved with an extra level of indirection" - even the ones I've listed but there are some problems I wouldn't like to deal with. Neither as the creator nor as the user of the interpreter...
Andreas Gieriet 16-Jul-14 4:51am    
Besides a lot of philosophical text, what basically remains is: "in python you do...", "is verbose", "is impractical", "separation of declaration and definition", ... But you show no evidence.

The interpreters I have used so far are a kind of "shell interpreter" that lets you type in one-liners for quick tasks (based on a "library" of existing programs or functions or commandlets, or ...) and execute small to huge scripts (from one to many files, possibly with included and/or "precompiled" packages or libraries...).

If you have ever worked with PowerShell, you can imagine instantiating objects (e.g. with the C++11 "auto" type) and calling functions by some qualified name or on the object. You may define a function easily, e.g. void SayHello(string you) { cout << "Hello " << you << "!" << endl; } and call it with SayHello("Andi");. This is a trivial example, but not far away from other interpreted scripting languages. The needed #includes may be part of the environment (similar to .bash or the like, where you put all the common parts for convenience).

As for multiple object files: object files and libraries (as containers of object files) are only the serialized memory representation of each C++ translation unit. The interpreter does not need to serialize what it holds in memory, and what is referenced from outside can easily be de-serialized into memory as if it had been parsed directly.

I finally agree with you: the goal of the OP is very unclear. And I also agree that the task of having a C++ interpreter implemented is a big task.

But I do disagree that it would be more verbose than others, less practical than others, or a technical problem.

Just a considerable amount to do (but you will learn a whole lot by doing so ;-)). Two to four years of hobby work beside your daily business?

Cheers
Andi
pasztorpisti 16-Jul-14 6:50am    
- What's wrong with using python as an example?
- C/C++ is more verbose than many other modern languages. This may not be a strong point for some people, for me it is in case of one liners. def SayHello(you): print("Hello %s!" % you)
- "Impractical", "separation of declaration and definition" without evidence: My answer doesn't try to be a full answer as that would be too long. Instead I've just listed a few problems that can make people think about the implementation of the solutions to these problems in the interpreters of these languages. The solution to the listed problems turn out to be very easy for simple dynamic languages and they would be more complex in case of a C/C++ interpreter. Of course this requires some practice/insight into the construction of a simple interpreter but I assumed that OP has this.
- Any language can be pre-compiled or interpreted, or anything between the two. Even "shell script" languages; I never said it's not possible. I just said that both the creation and the usage of a C/C++ interpreter would be more difficult than doing the same with some other (dynamic) languages.
- I would rather spend years of my free time working on something that I find useful. If OP finds a C/C++ interpreter useful for some reason then it can indeed be a real goal.
- Object files: Depending on OP's goal and on the actual interpreter implementation chosen to reach it, object files may be completely useless or impractical when it comes to creating an interpreter. But it's not a problem, as how OP transfers data between the stages of his compiler is only a tiny detail. When I mentioned object files I was only brainstorming about easy and robust reuse of the frontend of some existing compilers. In some cases writing a backend may be easier than parsing object files.
Andreas Gieriet 16-Jul-14 8:00am    
Nothing is wrong about Python. But it's about C/C++ and not what else could be useful. I find it more rewarding for every one to encourage people to look at a nice challenge (instead of listing all that *might* go wrong).

You list many points why *not* to do, but none *why* one might want to do. I did reply since most of your arguments have no evidence (i.e. I had to guess what item you mean when you make such claims like verbosity, declarations and definitions, etc.).

Let's partition the system into manageable chunks and only then decide whether it is worth inventing everything, or gluing together what is around (like frontends, intermediate languages with their existing interpreters, letting C++ be transformed into C and letting CINT do the interpreter work, etc.). C++ is just *one* language - maybe you get to a basic concept that allows the backend to be built up first (e.g. based on C/CINT) and then one/two/... frontends for that backend. Find out if this works, etc.

Whether *you* would spend time on that is irrelevant to the discussion. People are in general mature enough to decide if they want to spend time on something - be it hobby or be it for business - as long as they get a rough picture of the aimed outcome. ;-) Maybe an open source project would attract enough smart brains eager for a challenge and willing to work on it with enthusiasm...

Others collect stamps - it's their pleasure!
My speculative reason: no one seems to have had a need that justified the effort. I.e. you might be able to build an Eiffel Tower in your garden, assuming you have the space, the material, the people, good enough ground, etc., and you can do it for your own pleasure. ;-)

The difference from C to C++ is as stated in the C++ standard:
In addition to the facilities provided by C, C++ provides additional data types, classes, templates, exceptions, namespaces, operator overloading, function name overloading, references, free store management operators, and additional library facilities.

If going to C++11, you might add lambda expressions and several other parts, all mainly syntactic sugar in the sense that they abstract something that is mainly parser related and could also be done (if tediously) by other means.

What results as runtime items from parsing the above features are
- additional data types: some more built-in char/numeric types
- class: extended instance functions with this parameter
- class: inheritance with base class as sub object of a derived class
- class: special implicit function calls (ctor, dtor, etc.)
- class: vtable for virtual function calls
- class: ordering of definitions (mostly) does not matter in classes - only name lookup is affected
- templates: generated classes or functions based on type or constant integral values
- exceptions: additional control flow and especially cleanup-bookkeeping code
- namespaces: "syntactic sugar" (the unique types, objects, functions get defined and called)
- overloading: "syntactic sugar" (the parser creates calls to the correct functions)
- references: "syntactic sugar", results in pointers or no runtime code at all
- new/delete: low level memory malloc/free plus ctor/dtor calls
- libraries: collection of items that base on language features
- lambda expression: "syntactic sugar" (implicit functor class)
- ...

There are nasty detail problems to overcome, but technically it can be regarded as a much extended C.
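
To illustrate the class-related items in the list above, here is a rough sketch of how an interpreter might represent them at run time: member functions become plain functions with an explicit "this" argument, and each object carries a pointer to a table of virtual functions. Python dictionaries stand in for C structs, and every name (SHAPE_VTABLE, new_square, virtual_call, ...) is hypothetical, not how CINT or any compiler actually lays things out:

```python
def shape_area(this):
    # Base-class implementation of the virtual function "area".
    return 0.0


def shape_describe(this):
    # A virtual call from inside a member function: dispatch through the vtable.
    return "area=%.1f" % this["vptr"]["area"](this)


# One table per class, shared by all objects of that class.
SHAPE_VTABLE = {"area": shape_area, "describe": shape_describe}


def square_area(this):
    # Override of "area" in the derived class.
    return this["side"] * this["side"]


# The derived table starts as a copy of the base table, then replaces slots.
SQUARE_VTABLE = dict(SHAPE_VTABLE, area=square_area)


def new_shape():
    # "Constructor": the object is just data plus a pointer to its class table.
    return {"vptr": SHAPE_VTABLE}


def new_square(side):
    obj = new_shape()            # base sub-object is constructed first
    obj["vptr"] = SQUARE_VTABLE  # the derived constructor then repoints the vtable
    obj["side"] = side
    return obj


def virtual_call(obj, name, *args):
    # What a call like obj->name(args...) boils down to at run time.
    return obj["vptr"][name](obj, *args)


base_text = virtual_call(new_shape(), "describe")
square_text = virtual_call(new_square(3), "describe")
print(base_text, square_text)
```

Note that describe is inherited unchanged yet reports the derived area, which is the behaviour a vtable exists to provide. Destructors, exceptions and templates need considerably more bookkeeping than this.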

So, if you have a C++ frontend that manages to translate into functionally equivalent C, then you have arrived at the level of CINT. Problems like multiple files and referencing libraries and object files are not unique to C++ but are common to all C* languages. So you can reuse the concepts the interpreters of those languages (CINT?) use for them.

But again, justifying the effort depends very much on the prospective "gain".

Cheers
Andi
 
Because C++ is a massive programming language; it is just too complicated. Interpreters are created for languages like Python and JavaScript not only because those languages are simple, but because they are dynamic and meant to be interpreted. The first Python implementation was an interpreter, not a compiler. That is not the only reason. C++ is known to be faster than interpreted languages when compiled, but you can't expect the same from a C++ interpreter. C++ is context-sensitive; C++ is huge; C++ has generics (templates). The complexity of the language therefore makes processing the source code slow. For example, look at how long C++ takes to compile. What will happen if you interpret the source? A classic "hello world" program may not take long, but think of a program with heavy use of generic techniques.

But there are still some interpreters, like CINT and Ch. See the following page:
http://stackoverflow.com/questions/69539/have-you-used-any-of-the-c-interpreters-not-compilers[^]
 
Comments
pasztorpisti 18-Jul-14 22:36pm    
5ed, seems to be a shorter version of my answer. I'm afraid that in order to really feel the difference in complexity between a C++ interpreter and the interpreter of a dynamic language, one must first implement at least a simple/small interpreter for a dynamic language. Not necessarily for an existing full-featured language; for this purpose a simple custom language with only a few features would be enough. After that it's much easier to imagine how complicated it can be to deal with some C++ interpreter related problems. Even if someone's goal is creating a C++ interpreter, I would definitely recommend creating a simple interpreter for a dynamic language as a first learning project, as doing so requires much less time than going the long hard way. Creating an interpreter from scratch for a dynamic language can be done in a few days or weeks, but I would estimate the creation of a C++ interpreter to take at least months or years even when reusing existing frameworks, and I'm pretty sure a lot of serious problems would arise during development (and my guess would be that some of these problems would have no nice solutions because C++ wasn't at all designed to be interpreted - this is why I wrote about "language design problems" in my answer).
[no name] 19-Jul-14 0:16am    
Thank you. Some C++ language constructs make no sense if the language is interpreted. One of them is the forward declaration. Forward declarations are not needed in an interpreted language, though a C++ interpreter does not ignore them; it checks all the declarations for syntax errors. Will that be efficient? C++ is mainly used when performance is necessary. Dynamic languages are used when performance isn't much of a concern. Also, C++ is a language with manual memory management. No interpreted language provides pointer arithmetic. No interpreted language is so verbose. And another important feature is template metaprogramming. Metaprogramming is not metaprogramming if C++ is interpreted. There isn't even a need for a C++ interpreter.
pasztorpisti 19-Jul-14 8:56am    
My mind was spinning around exactly such problems. The number of these problems is endless. One of the most serious problems I was also thinking about is manual memory management, which is never really needed in practice when you are toying around in an interpreter. An interpreter is the "calculator of a programmer", and even if you need something like "manual memory management" you usually need only an array; the interpreter is not the place for manual memory management, which is usually done in order to optimize things at a low level (for low memory consumption, better locality, etc...). In an interpreter you don't need tools that exist for low-level optimization, but in the case of C++ you would have to deal with them, and they would cause A LOT of headaches in some scenarios. We have just scratched the surface with our problem listing, but the number of WHY NOTs is already too high in my opinion to start writing a C++ interpreter.
[no name] 19-Jul-14 22:22pm    
Agree. Apart from all these problems, there is no necessity for a C++ interpreter. Most of the C interpreters, if not all, were written as a hobby or for learning purposes. That's because C is simpler than C++. There are still a few C++ interpreters, like Ch and CINT, but they fail to support C++ fully. The Ch homepage says:
"Many salient C++ features including classes, objects and encapsulation for object-based programming (Brain-damaging features are excluded)."
Ch doesn't support generics, so no STL and no Boost. In that case Ch should not use the name C++. How would one use C++ nowadays without the STL? By writing custom containers?
pasztorpisti 19-Jul-14 23:44pm    
Sure. The short answer is indeed that simple but try to put it that way - some C++ advocates will chop off your limbs! :-)
It's hard to really "solve" this question - may have been better in a forum, but here's my 2c.

The language definition does not really preclude making an interpreter, but C++ was really designed as a low-level language that can be used for extremely efficient code. The language is extremely complex, but not impossible to parse (AFAIK it really requires a GLR parser or a hand-written parser). That said, most of the goodness of C++ comes from the fact that it can be used to write efficient code that runs close to the machine. Interpreters are typically used for high-level tasks, where quick iteration during development is more important than speed of execution.
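
One well-known reason the grammar alone is not enough: in C++ the statement "a * b;" declares a pointer named b if a names a type, and is a multiplication expression otherwise. The toy classifier below is only a sketch of that dependence on name lookup; the classify function and its type_names parameter are made up for illustration and handle just this one statement shape:

```python
def classify(statement, type_names):
    """Classify 'X * Y;' as a declaration or an expression via name lookup."""
    tokens = statement.rstrip(";").split()
    if len(tokens) == 3 and tokens[1] == "*":
        if tokens[0] in type_names:
            # X is a known type, so this declares Y as a pointer to X.
            return "declaration of pointer '%s' to %s" % (tokens[2], tokens[0])
        # Otherwise X is an ordinary value and '*' is multiplication.
        return "expression: %s times %s" % (tokens[0], tokens[2])
    return "unhandled"


as_decl = classify("a * b;", type_names={"a"})
as_expr = classify("a * b;", type_names=set())
print(as_decl)
print(as_expr)
```

The same token sequence yields two different parse results depending on what was declared earlier, so the parser and the symbol table cannot be separated cleanly.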

In short, it would be a hell of a lot of work, and would run like a dog.
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


