|
Sure, we can always find difficult issues - the idea is to find ways to make our code more readable and understandable for whoever has to maintain it.
My style has changed a lot over the past 44 years of coding, and it is still changing as I discover better ways of making my code more reliable and more maintainable.
It's really frustrating, for example, when I am reading someone else's code and it is impossible to tell a local variable from a class-level variable - maybe that does not bother you, but it does me.
I certainly welcome any ideas for making my code better; however, not doing the best that I can to let the reader know about the type, scope, and usage of a variable is a non-starter for me.
|
|
|
|
|
David Roh wrote: Sure, we can always find difficult issues - the idea is to find ways to make our code more readable and understandable for whoever has to maintain it.
Yes that is a common way to rationalize subjective preferences.
David Roh wrote: It's really frustrating, for example, when I am reading someone else's code that makes it impossible to tell a local variable from a class level variable - maybe that does not bother you but it does me.
Nope that doesn't happen to me.
What does happen to me is that I get frustrated when I realize the code has no error checking. Or is ignoring part of the API it is supposed to be interacting with. Or is just doing something wrong with the API. Or has some horribly inefficient design in a piece of code that by definition must handle high volumes. And other types of problems like that.
If my most serious problem or even close to most serious problem was confusion about the type of a variable then I would suspect that I had moved into an alternative reality.
David Roh wrote: I certainly welcome any ideas for making my code better; however, not doing the best that I can to let the reader know about the type, scope, and usage of a variable is a non-starter for me.
And I welcome ideas that objectively make code better. I do, however, realize that most ideas are not backed by anything objective. I also recognize that the vast majority of development problems do not have anything to do with the minor subjective details of syntax. It is similar to worrying about which color of nail polish works best while ignoring the symptoms of a heart attack.
|
|
|
|
|
Well, I am a firm believer in Hungarian and here is why - I believe that the name of a variable should instantly convey three vital pieces of information to the dev:
- its type, and I do mean type as in compiler type, not semantics (regardless of what Charles did or did not intend)
- its scope - is it local, a parameter, a passed-in reference, a class-level variable
- a clear indication of its function or purpose
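To make the three criteria concrete, here is a minimal C++ sketch of one way they might be encoded in names; the specific prefixes (m_, p_, str, n) are illustrative assumptions, not necessarily the poster's exact convention:

```cpp
#include <cassert>
#include <string>

// Hypothetical encoding of type, scope, and purpose in variable names:
//   m_  -> class-level (member) scope
//   p_  -> passed-in parameter
//   str / n -> compiler type (string / integer count)
class InvoicePrinter {
public:
    explicit InvoicePrinter(const std::string& p_strCustomerName)
        : m_strCustomerName(p_strCustomerName) {}

    // Locals carry a type prefix but no scope prefix.
    std::string BuildHeader(int p_nInvoiceNumber) const {
        std::string strHeader = "Invoice #" + std::to_string(p_nInvoiceNumber);
        strHeader += " for " + m_strCustomerName;  // member scope is obvious
        return strHeader;
    }

private:
    std::string m_strCustomerName;  // type (string) + scope (member) + purpose
};
```

Reading `m_strCustomerName` at any use site tells you it is a string, it belongs to the class, and it holds the customer's name, without looking at the declaration.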
Different devs have their own view on how to write code; however, I believe that it's better to have a good logical reason for a belief rather than just a prejudice.
I believe that one of the reasons Hungarian is considered bad is the way many devs implement it - that is, each group uses very short abbreviations that are not clear to someone who is not part of the team, so when you look at another team's code, you have to learn their abbreviations. Different teams have different rules and different ways of implementing Hungarian, which makes it a hassle to read another team's code.
I believe that it is not Hungarian that is the problem but rather the way that it is implemented that is the problem.
First, we do very little abbreviation, and only if it is an abbreviation so common that a noob would clearly understand it. We use longer names - yes, we believe in self-documenting code, which means my manager should be able to read my code and understand in general terms what is being implemented (we don't, however, go to extremes).
If a passed-in parameter is a reference that will live after a function is finished, then it is important to know that altering it can have long-term consequences. If a variable is a class-level variable, it is extremely important to know.
It should also be obvious that the more accurately a variable's name indicates its purpose and function, the less potential there is for another dev - or the original dev returning to old code - to misunderstand it.
Saying that Intellisense will solve the problem, or that we can mouse over a variable, is not a solution, because when we read code and think we understand it, we rarely mouse over or use Intellisense on what we think we already understand.
Saying that Hungarian is bad because the type might change is a really weak argument - in the first place, types don't usually change that much; second, if a type does change, I certainly want to know; and lastly, with today's IDE refactoring capability, is it really that hard to change a variable name (it's certainly easy for me)?
We can argue about the best style to use to convey information to another dev reading our code, but I find it difficult to believe that any professional dev would say it's not important to be able to easily and quickly understand a variable's:
- compiler type
- scope
- purpose
but I am sure some will.
It's about trying to find better ways to write reliable, easy-to-understand code - so whatever helps is a good thing.
|
|
|
|
|
Well argued.
I simply disagree.
Variable types change a lot more than you seem to be aware. Especially in an agile environment.
If a variable is given a descriptive enough name, then its underlying type is already apparent in the name.
Hey, but if it works for you go for it. I'll never read your code and won't work somewhere where it's required.
"If your actions inspire others to dream more, learn more, do more and become more, you are a leader." - John Quincy Adams
"You must accept one of two basic premises: Either we are alone in the universe, or we are not alone in the universe. And either way, the implications are staggering." - Wernher von Braun
|
|
|
|
|
David Roh wrote: Different devs have their own view on how to write code; however, I believe that
it's better to have a good logical reason for a belief rather than just a
prejudice.
Devs come up with all sorts of ways to rationalize that their personal subjective preferences are objectively better.
David Roh wrote: Saying that Intellisense will solve the problem or that we can mouse over a
variable is not a solution because when we read code and think we understand, we
rarely mouse over or use Intellisense on what we think we already
understand.
In my experience, I can't recall ever seeing a single bug traced to a misunderstanding of the type.
On the other hand there are vast other categories of bugs that show up over and over again and which cost businesses real money to find and fix.
So given your strict approach to just variable naming, could you provide some detail as to what other process practices you mandate to ensure that real and costly errors which impact development efforts do not show up as well?
David Roh wrote: ...but I find it difficult to believe that any professional dev
You must really work in some different domains than I do. In the sectors that I work in, I am just happy to come across code (like whole methods and classes) that is easily understood from a functional point of view. Too often I come across code with problems like the following.
- Code that was obviously never tested fully
- Code that was never even used.
- Code that was not even appropriate for the business (no known use ever.)
- Code that does absolutely no error checking.
- Code that is so convoluted and has such high coupling that it scares me just to look at it. And of course there are no unit tests for such code.
|
|
|
|
|
I don't care for Hungarian notation because it doesn't add any useful information.
Consider the following two variable names:
CustomerName
strCustomerName
There is no added value in the 'str' prefix in that example because both variables are obviously string variables. All my variable names tend to make their type obvious - there is no reason to use short, confusing names that would make such a notation necessary.
The tool tip will tell one the type anyway.
|
|
|
|
|
CustomerName could also be a class, such as (in shorthand):
class CustomerName
String FirstName
String MiddleName
String LastName
Int32 SalutationID
Int32 TitleID
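For what it's worth, the shorthand above can be fleshed out into a compilable C++ sketch; the FullName helper is a hypothetical addition, not part of the original post:

```cpp
#include <cassert>
#include <string>

// CustomerName as a composite class rather than a string, per the
// shorthand above. Field names come from the post; FullName is invented
// purely for illustration.
class CustomerName {
public:
    std::string FirstName;
    std::string MiddleName;
    std::string LastName;
    int SalutationID = 0;
    int TitleID = 0;

    // Without a prefix, a variable named customerName could bind to either
    // this class or a plain string; only the declaration disambiguates.
    std::string FullName() const {
        return FirstName + " " + LastName;
    }
};
```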
|
|
|
|
|
While I doubt anyone would create a CustomerName class, I think your point still stands.
Regardless, I've never found Hungarian Notation to have enough benefit to stick with it - and that is after having used it for years. If someone else wants to use it that's fine with me.
|
|
|
|
|
Same here. I neither recommend it nor discourage it. I do like the idea of a shared notation when working as a team.
|
|
|
|
|
Hungarian Notation expresses data type, but it does nothing for, and can even obscure, data semantics. I'm sure that was no part of Charles Simonyi's intent, but it does seem to work out that way quite often.
The old guideline that one should keep one's procedures / functions / methods short enough to be completely viewed on the monitor screen tends to produce the best legibility. If you can see everything at once, it's more difficult to go astray about either type or semantics.
My personal practice is to keep procedures as short as possible -- it's been a long time since I last wrote a procedure that can't be displayed in its entirety on the screen -- and to adhere to an inside / outside convention regarding variable names:
- Variables declared inside the procedure will have short names, with the exception of static variables whose significance extends beyond individual invocations of the procedure.
- Variables declared outside all procedures will have long, maximally descriptive names, since this is the space in which most problems of coupling and timing arise.
- Of course, those "outside-all-procedures" variables will be minimized in number, and protected from thread collisions with mutexes as appropriate.
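As a rough illustration of that inside/outside convention (all names here are hypothetical), a short procedure with terse locals and a long-named, mutex-guarded outside variable might look like:

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Outside-all-procedures state: long, maximally descriptive names and a
// mutex to protect against thread collisions, as described above.
static std::mutex g_pendingEventQueueMutex;
static std::vector<int> g_pendingEventQueueSharedAcrossThreads;

// Short enough to view in its entirety on screen; locals have short names.
int DrainPendingEvents() {
    std::lock_guard<std::mutex> lk(g_pendingEventQueueMutex);
    int n = 0;
    for (int e : g_pendingEventQueueSharedAcrossThreads)
        n += e;  // sum the queued event codes, then empty the queue
    g_pendingEventQueueSharedAcrossThreads.clear();
    return n;
}
```

The long outside name makes coupling visible at every use site, while the short locals stay readable because the whole procedure fits on one screen.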
Now, I'm a real-timer; my applications are always heavily multi-threaded, and I'm always intensely concerned with attaining a reliably predictable response time to any imaginable event. If you do other sorts of programs, you're likely to have different desiderata...but I can't imagine that the conventions described above would harm you, even so.
(This message is programming you in ways you cannot detect. Be afraid.)
|
|
|
|
|
I used Hungarian notation in C++ all the time, as it is close to the machine, and so things like pAddress and ppAddress and dwAddress and lpdwAddress and wAddress, or sStructure and tTypedef, etc., are very useful distinctions.
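A small sketch of how those pointer-level prefixes pay off in C++; the function and names here are hypothetical, chosen to echo the prefixes above:

```cpp
#include <cassert>
#include <cstdint>

// dw = a 32-bit (DWORD-sized) value, p = pointer, pp = pointer-to-pointer.
// The prefixes make the levels of indirection visible at each step.
std::uint32_t AdvanceThroughDoublePointer(std::uint32_t** ppAddress,
                                          std::uint32_t step) {
    std::uint32_t* pAddress = *ppAddress;  // one dereference: a pointer
    std::uint32_t dwAddress = *pAddress;   // two: the value itself
    *pAddress = dwAddress + step;          // write back through the pointer
    return *pAddress;
}
```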
With higher-level languages I tend not to annotate strings, and booleans are usually posed as a question, like hasLoaded or isUserFemale. Though annotations to suggest private/protected variables, like the underscore (_), I still find useful. We don't often have to consider whether something is 16, 32, or 64 bits long, and we rarely access raw pointers, so pAddress and ppAddress have less use, and sStructure and tTypedef are generally just full classes, to the detriment of performance. Also, we use var a lot and let the compiler infer the type, so this is another factor.
The next factor is the IDE: if the solution is fully integrated, as in MS Visual Studio, then you can easily go to the definition or hover over the variable, so the notation is less necessary. Though, as you say, if you copy and paste snippets elsewhere you won't always know what they are - though you can usually guess primitives, and for anything complex you will need the class/structure/typedef anyway.
So I would say - carry on for C/C++, ASM, etc., but for higher-level languages, where types are often chosen at compile time or even runtime, sometimes it is OK to leave it out.
|
|
|
|
|
VuNic wrote: I just open it in notepad and try to figure out what datatypes all these variables are.
Well, I might not know why you're using notepad instead of VS. But couldn't you at least use Notepad++?
I mean, you can't be doing this for nostalgia purposes only. Or could you?
"To alcohol! The cause of, and solution to, all of life's problems" - Homer Simpson
"Our heads are round so our thoughts can change direction." ― Francis Picabia
|
|
|
|
|
Before writing the coding standard, I would recommend reading "Code Complete" which gives some very specific arguments about naming conventions and other things that would be in the standard. The good news is that no matter what, just HAVING the standard, even if it's not perfect, will go a long way.
For our company, which is largely focused on embedded projects using C, we adopted one that was already well developed (Michael Barr's Netrino embedded standard), bought a few hard copies for our reference, and made a one- or two-page document that says that's what we're using and what changes to it we're implementing (very few tweaks). Taking this route (adopting an already-written standard) saved us a TON of time (read: money) and took out some of the "personal preference" discussions.
If you're using C#, why not adopt Microsoft's standard and call it a day?
(I realize this isn't exactly what you were asking about.)
|
|
|
|
|
Back in my embedded C and MFC days I used to be big on hungarian notation. No more. These days I am working mostly with C# and stay away from prefixes as much as possible for a few reasons.
1. Readability
As programmers we spend most of our time reading code not writing it. What's easier to read:
sAccountNumber = CreateAccountNumber(nId, sLastName, wUniqueId);
or
accountNumber = CreateAccountNumber(id, lastName, uniqueId);
2. Maintenance
Let's say I no longer want to use a string to store my account number. I want to encapsulate it inside a class. If you used the accountNumber variable to begin with, you don't need to worry about renaming it. It's still just an accountNumber.
With modern IDEs you really don't need to encode the variable type in its name. You can hover over it with your mouse and find out immediately whether it's an int or a string, or some other user-defined type. In fact, you should try to avoid thinking about storage types as much as possible, and program at a higher level of abstraction. Only when you get to the low-level code should you take care of the types.
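The maintenance point translates beyond C#; here is a hedged C++ rendering of the same refactor, with hypothetical names and an invented formatting rule: the variable stays accountNumber even after the string is encapsulated in a class.

```cpp
#include <cassert>
#include <string>

// The string that used to hold the account number is now wrapped in a
// small class. Because call sites named the variable accountNumber (not
// sAccountNumber), no rename is needed after the refactor.
class AccountNumber {
public:
    explicit AccountNumber(std::string digits) : digits_(std::move(digits)) {}
    const std::string& Digits() const { return digits_; }
private:
    std::string digits_;
};

AccountNumber CreateAccountNumber(int id, const std::string& lastName) {
    // Hypothetical formatting rule, purely for illustration.
    return AccountNumber(lastName.substr(0, 2) + "-" + std::to_string(id));
}
```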
I highly recommend Robert C. Martin's "Clean Code"[^], particularly Chapter 2: Meaningful Names apply to this discussion.
"There are only 10 types of people in the world - those who know binary and those who don't."
|
|
|
|
|
I prefer Hungarian notation for the same reasons. We don't always have an IDE handy when reviewing code. I use "m_" for module-level scoped variables, and no prefix for variables local to a given method or property.
So, what prefixes would you use for the following (and their array counterparts where arrays make sense):
Int16
Int32
Int64
UInt16
UInt32
UInt64
String
DateTime
Single
Double
Boolean
Dictionary/Dictionary<>
List/List<>
or a variety of other objects?
Your choices could be helpful to us Neanderthals still using Hungarian notation.
|
|
|
|
|
I don't even understand the question. If your C++ program is not broken up into .cpp files of less than 1000 lines, coupled usually 1-1 with .h files of less than 100 lines, there is far more wrong with your coding than Hungarian notation. You know what the data type of a variable is because each class has only a few member variables, and their types are given in the .h file.
If you have a boatload of global variables, or big sprawling files with a zillion classes in them, or you're declaring members public and peeking inside other classes to look at their variables, I could see how you'd have a problem remembering the types. But this is just bad practice.
Hungarian notation solved a problem of the 1990's C-based world, where you had thousands of globals in a big project. That problem belongs to the past. If you have that problem now, then your code belongs to the past.
|
|
|
|
|
Because it's not the 90's anymore and there are better ways to know what a variable type is?
You should be using an editor that will tell you what the type is; then you will *know*. You will know even when the type changed from signed to unsigned or from short to long.
Besides it has been my experience that most people use Hungarian notation so they don't have to come up with decent variable names:
I always see functions with crap like this:
int iValue;
char *strzValue;
double dValue;
float fValue;
|
|
|
|
|
I always publish a data dictionary with every project, plus a style guide for any deviations from "standards" and to define which "standard" I am using. A single file holds the definition and type for each variable. It is easy to find the datatype, and you actually know what the variable means in the real world. This also helps keep business variable names distinct and consistent across an entire project (e.g., never "name"; always clientName, providerName, etc., everywhere), which I feel is a critical practice. I also publish "global" names that are reused (e.g., ndx is always used for the innermost indexer in loops). You can write scripts that will find most variables and add them to a dictionary.
For property backing fields I use _whatever, so I know I will have a property Whatever on the object. Though I sometimes shortcut when it is very clear (e.g., on the "Client" object I may have _clientName on the back, but "Name" on the property). I always (and only) use type prefixes for GUI object types in stuff like ASP.NET (e.g., "asp:Button btn_SaveClient"), and always publish the prefixes (btn = button, tb = textbox) in the data dictionary. This really helps in the code-behind when VS creates event handler names for you - tb_Client_OnFocus, tb_Client_TextChanged.
Every naming convention is only designed to make code clearer. A data dictionary does this best (IMHO), and if you inherit multi-convention code it really helps "fix" it (e.g., if in one file I have clntCode, in another ClientID, and in another intClient, I can add them all to the dictionary, make sure I understand them, and then find-replace for each: clntCode int; -> clientCode Int32; ClientID decimal(6,0); -> clientCode Int32;). Not perfect, but a good start.
Which convention you use doesn't matter if you are consistent and document well. A data dictionary does this, provides a big step in documentation, and helps provide a global view of your application.
|
|
|
|
|
I was on a team a few years back tasked with writing coding standards. We started to define prefixes for all commonly-used types and user controls. We quickly realized that there were just too many variations for us to account for in modern Object-oriented systems. This includes dozens (or hundreds) of classes as well as custom subclassed UI controls.
Additionally, Visual Studio (and other IDEs) are kind enough to display tool-tips indicating the data type. I also frequently use the "Go to definition" feature to further explore the origin of variables or types.
It's important to remember that the type of some variables can change over time (e.g., Int32 to Int64, or Double to Decimal). Coupling the variable name to its type just increases the potential need for future refactoring that could be avoided by simply using a more generic (but still descriptive) name.
I realize the big exceptions to the IDE-sugar are when you print code or view it in a plain text editor. But what my team decided is that these were very rare in our situation.
We ended up recommending a "good descriptive name" for variables and UI controls. We realized that with the Intellisense-driven Visual Studio experience you only have to type a variable or control's name ONCE (in most cases) and Intellisense will assist with all subsequent references to it.
|
|
|
|
|
I actually worked at a place that was very strict about Hungarian notation. After getting used to it, the whole thing felt super natural and I really enjoyed it. The conventions were "sane" Hungarian notation, as you can go overboard with it. We had maybe 10 or so prefixes everybody knew and used. Maintaining the code was very fun. I wish more places would adopt those notations. We were coding in C, so I don't know if it would work the same for C++.
|
|
|
|
|
VuNic wrote: Though it's C++ or C#, we do have primitive data types everywhere. In fact, for
smaller projects, primitive data types would account for 90% of the variables.
I doubt that is generally true.
Especially since, presumably, you are not claiming that one should use Hungarian for local variables.
VuNic wrote: How do I know what datatype it is?
Context and naming. Are you unsure about the types of the following variables?
fullName = lastName + " " + firstName;
connection.Commit();
for(int i=0; i < accounts.Count; i++)
VuNic wrote: For example, the code is a 100K line code and I cannot copy the entire project
to my disk to review that at home.
Say what?
Your "disk"? Where exactly do you work? Exactly what kind of "computer" do you have at home?
My memory stick was old and cheap when I bought it, yet it holds everything I need to move entire code trees (plural, at the same time). Just one of the code trees that I commonly move around has 7,000+ files in it, and probably 90% of those are source files (and I do not do GUIs, so there are no image files of any sort).
So if you presume 5,000 files with 100 lines on average, then that one code tree has 500,000 lines of code. I suspect the actual count is higher.
And how exactly are you going to "work" on something at home if the piece you are working on, in some context, is not complete?
VuNic wrote: If anybody has a valid reason against it, I'm all ears to it.
Because it will not provide any measurable benefit AND because as a standard it will annoy probably everyone else except you.
And morale has been proven to have an impact on productivity.
|
|
|
|
|
It's a subjective preference of course, so there are no absolute reasons. Back in the day I used Hu notation; we pretty much all did when it was the "thing you should do".
However, here are some points that collectively tilt the scale towards not using Hu these days.
* Hu originally solved a problem, the difficulty of quickly differentiating typed variables in primitive development environments.
However, the modern use of more powerful IDEs, with more robust Intellisense or similar capabilities that make it trivial to "see" applicable metadata about variables, addresses the need for a quick means of figuring out what is what.
Though you say you like to take snippets of logic to review in a simple text editor rather than "have the entire project" local and use your IDE of choice, I would think that you are an exception. Using source repositories it is trivial and normal to have a copy of all source on your workstation or laptop, to work "disconnected", and to sync deltas on either side up with the repository when connected and you wish to. I'm not sure what benefit you would gain from working with snippets in a "dead" environment, when it is so easy to work on a compilable, runnable local version.
For that matter, these days the source repository is generally only a VPN connection away unless you are on the road or in the air in most corporate environments.
This usage of snippets to take home may be an old habit of yours that bears reevaluation; maybe it's no longer necessary.
* In the style of OOP currently en vogue, injection is often favored. Many things that used to be variables are often parameters or properties now.
Any localized variables tend to be very localized, as in declared very close to their usage, and at method scope or even less (lambdas, etc).
Short, concise methods are greatly preferred over long rambling ones, making it much more likely that variable declaration is in close proximity to its usage.
* Refactoring is commonly practiced, ideally altering implementation while maintaining functionality. What is a double today might be an int tomorrow; a string might become a char[] or vice versa. Ideally the name shouldn't need to change.
Related to this is the consideration of once-and-only-once: if the name of a variable contains some token or segment that is a direct restatement of its (original) underlying type, you have effectively duplicated the information. If the underlying type changes, the "duplication" of the original type information should also change, but might not, causing a classic information-synching issue.
* In C#, the usage of the var keyword and type inference at compile time is a pretty useful feature, but it does not cooperate very well with Hu notation.
Something like:
var intResult = SomeMethod(...);
is obviously kind of silly...especially if SomeMethod(...) later gets refactored to return int? vs int, or any other type at all.
Also...what is the Hu notation for a nullable primitive?
int? nintResult;
int? n_intResult;
int? nIntResult;
Not loving that idea.
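A similar tension shows up in C++ with auto and std::optional; this hypothetical sketch (SomeMethod and all names are invented for illustration) mirrors the nullable-prefix problem described above:

```cpp
#include <cassert>
#include <optional>
#include <string>

// If SomeMethod once returned int and callers named the result intResult,
// wrapping the return in std::optional leaves the prefix lying about the
// type. (Requires C++17 for std::optional.)
std::optional<int> SomeMethod(const std::string& input) {
    if (input.empty()) return std::nullopt;  // refactored: may now be empty
    return static_cast<int>(input.size());
}

// With auto and a type-free name, the refactor needs no rename:
int CountOrZero(const std::string& input) {
    auto result = SomeMethod(input);  // not "intResult" or "nintResult"
    return result.value_or(0);
}
```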
* .NET and C# in particular takes some pains to smooth the differences between value types and reference types, and encourages you to use an instance of a primitive object on the stack similarly to an instance of a reference type on the heap aside from the obvious localized scoping of changes to value types. As complex types tend to be much more common in most applications, naming conventions have evolved to tend to favor complex types more than primitive types.
Complex types with Pascal-cased names like MyVerySpecialClass : MyVeryAbstractBase mix with Hu oddly if you are of a very literal bent...
MyVeryAbstractBase mvabSubclass = new MyVerySpecialClass();
Early on in .NET - and this seemed to be encouraged by MS, based on writings and examples of the time - the tendency was to just use a sort of acronymization in a kind of vague Hu, something like this:
MyVeryAbstractBase mvab = new MyVerySpecialClass();
or, vexingly, an implicit assumption of specific concreteness such as:
MyVeryAbstractBase mvsc = new MyVerySpecialClass();
or, more irritating, unnecessary concrete declaration:
MyVerySpecialClass mvsc = new MyVerySpecialClass();
This habit seems to be falling away at last, though it still crops up in throwaway variables in for's, foreaches, and lambdas, where the brevity is basically harmless due to the very narrow scope (though I still prefer clear, full names).
Having one naming convention for primitives (Hu, for instance), and a different naming convention for complex types (PascalCase, for instance) works against the efforts made to unify usage of both.
Having said all of that, the MS Press Framework Design Guidelines book has a couple of pages on Hungarian notation and why it is restricted/discouraged in .NET.
A key point is, privately scoped members are not held to the same rigor as non-privately scoped members. If you want to name all your private variables using Hu, then only the other people on your team or future devs who land on your code base later are going to care.
Another key point is that the main value of conventions is consistency. It's better to be consistent and "wrong" per current prevailing mores than it is to inconsistently adhere to them. Personally, I find this to be a mark against Hu, as it is more difficult to remain consistent over time if underlying data types change, but still.
Personally...the only place I still use Hu notation is in designer / template centric environments such as .aspx web pages where type declarations are effectively split between the template and the backing object.
Thus, I find a prefixed notation like this to be handy:
<asp:textbox runat="server" id="tbSalutation" ... />
<asp:textbox runat="server" id="tbUserName" ... />
...SomeBackingBindMethod(SomeDataRecord record)
{
    if (record == null) return;
    tbUserName.Text = record.UserName;
    tbSalutation.Text = record.Salutation;
    ...
}
There is a nice secondary benefit of having an immediate mnemonic of what sort of control object I'm dealing with without needing to look at the designer as often, but the MAIN benefit the prefix gives me is that it naturally groups similar types of controls for Intellisense usage.
I don't have to remember the exact name of every textbox or checkbox or label; I can just type "tb" and Intellisense assists by presenting a filtered list of all the textboxes thus prefixed, allowing me to home in on the one I want quickly.
This is very pragmatic in my opinion, and also helps bridge the gap between OOP and parsed / merged view templates.
|
|
|
|
|
I would put forward a couple of reasons that Hu notation is not needed and why I do not use it.
1) All the names I assign within my objects (properties, methods, fields, etc.) derive their names from what the business owners call things in the real world, where there is a correlation. As a developer I can only do my job well when I also understand the real-world objects that my code represents. As such, a new developer must obtain some level of this knowledge for the data types to be evident. The example of AccountNumber in the above posts should be understood within the logical context of your system. If I call a variable accountNumber then likely I have some idea of what that account number is within the context of my system. Hu notation will not tell me anything more than the common business knowledge will. The reality is that if I name it strAccountNumber, all I know is that it is a string of some length. I would not be able to know simply from the "str" whether the business rules allow special characters, whether the max length is 8 characters, etc.
2) Modern IDEs virtually remove the need for Hu notation. If the declaration of a variable is not in view and the data type is not apparent based on my knowledge of the business domain, then (as a .NET developer) all I have to do is hover over the variable and the type is immediately shown thanks to Intellisense.
|
|
|
|
|
I used to use Hungarian notation when working in a weakly-typed language, but quickly gave it up when working with C# and .NET.
It wasn't just because C# is strongly typed; it was also because of:
- Property names and data binding
- Mapping database table and column names to classes and ORM's
- Serializing / deserializing objects to / from XML
- Configuration files
- Reflection.
Hungarian notation might be fine if your "names" never see the light of day; in most other cases, it gets ugly and geeky really fast.
|
|
|
|
|
Hungarian notation is very useful as long as you don't concatenate too many prefixes and you only use it when it actually helps.
I find the 'm_' prefix for class members and the 'p' prefix for pointers very helpful, and will protest at code that doesn't use them. After using 'm_', I feel free to add another prefix, as in 'm_pCursor', as this remains easy on the eye. After using the 'p' prefix I really want a capital letter, so that the p stands alone and is not mistaken for anything else. For me, prefixing the data type is secondary to this, as most IDEs will tell you the data type simply by hovering the mouse.
I use 'n' when my logic requires a positive whole number, and 'i' when my logic requires an iterator, although its implementation may be size_t or int depending on the need to use -1 to represent it pointing nowhere.
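A small C++ sketch of that n/i convention (the function and names are illustrative): the count gets 'n', the index gets 'i', and the index is an int so that -1 can mean it points nowhere.

```cpp
#include <cassert>
#include <string>
#include <vector>

// n = a positive whole count; i = an index. The index is int rather than
// size_t here because the logic needs -1 as a "not found" sentinel.
int FindName(const std::vector<std::string>& names, const std::string& wanted) {
    int nNames = static_cast<int>(names.size());  // a count: always >= 0
    for (int iName = 0; iName < nNames; ++iName)  // an index into names
        if (names[iName] == wanted)
            return iName;
    return -1;  // the index "points nowhere"
}
```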
I use prefixes for controls btnName, edtName.
I use established concatenations such as lpszString but try to avoid creating new ones.
|
|
|
|
|