|
ha... some of the data was related to the currently much talked about vulnerability, I see...
Mmm.. I am only at 2/4+
|
|
|
|
|
The original code was a bunch of preprocessor macros.
This is compile time if statements and lots of const folding.
It's deliberately unrolled so it's as inline as possible - this is very timing sensitive.
If you think this is bad, you should have seen the original code.
if(has_data_low_pins && has_data_high_pins) {
uint32_t pins_l = gpio_input_get();
pins_l = gpio_input_get();
pins_l = gpio_input_get();
uint32_t pins_h = gpio_input_get_high();
if(pin_d0>31) {
b = (((pins_h>>((pin_d0-32)&31))&1)<<0);
} else if(pin_d0>-1) {
b = (((pins_l>>(pin_d0))&1)<<0);
} else {
b=0;
}
if(pin_d1>31) {
b |= (((pins_h>>((pin_d1-32)&31))&1)<<1);
} else if(pin_d1>-1) {
b |= (((pins_l>>(pin_d1))&1)<<1);
}
if(pin_d2>31) {
b |= (((pins_h>>((pin_d2-32)&31))&1)<<2);
} else if(pin_d2>-1) {
b |= (((pins_l>>(pin_d2))&1)<<2);
}
if(pin_d3>31) {
b |= (((pins_h>>((pin_d3-32)&31))&1)<<3);
} else if(pin_d3>-1) {
b |= (((pins_l>>(pin_d3))&1)<<3);
}
if(pin_d4>31) {
b |= (((pins_h>>((pin_d4-32)&31))&1)<<4);
} else if(pin_d4>-1) {
b |= (((pins_l>>((pin_d4)&31))&1)<<4);
}
if(pin_d5>31) {
b |= (((pins_h>>((pin_d5-32)&31))&1)<<5);
} else if(pin_d5>-1) {
b |= (((pins_l>>(pin_d5))&1)<<5);
}
if(pin_d6>31) {
b |= (((pins_h>>((pin_d6-32)&31))&1)<<6);
} else if(pin_d6>-1) {
b |= (((pins_l>>(pin_d6))&1)<<6);
}
if(pin_d7>31) {
b |= (((pins_h>>((pin_d7-32)&31))&1)<<7);
} else if(pin_d7>-1) {
b |= (((pins_l>>(pin_d7))&1)<<7);
}
} else if(has_data_low_pins) {
uint32_t pins_l = gpio_input_get();
pins_l = gpio_input_get();
pins_l = gpio_input_get();
if(pin_d0>-1) {
b = (((pins_l>>(pin_d0))&1)<<0);
} else {
b=0;
}
if(pin_d1>-1) {
b |= (((pins_l>>(pin_d1))&1)<<1);
}
if(pin_d2>-1) {
b |= (((pins_l>>(pin_d2))&1)<<2);
}
if(pin_d3>-1) {
b |= (((pins_l>>(pin_d3))&1)<<3);
}
if(pin_d4>-1) {
b |= (((pins_l>>(pin_d4))&1)<<4);
}
if(pin_d5>-1) {
b |= (((pins_l>>(pin_d5))&1)<<5);
}
if(pin_d6>-1) {
b |= (((pins_l>>(pin_d6))&1)<<6);
}
if(pin_d7>-1) {
b |= (((pins_l>>(pin_d7))&1)<<7);
}
} else {
uint32_t pins_h = gpio_input_get_high();
pins_h = gpio_input_get_high();
pins_h = gpio_input_get_high();
if(pin_d0>-1) {
b = (((pins_h>>((pin_d0-32)&31))&1)<<0);
} else {
b=0;
}
if(pin_d1>-1) {
b |= (((pins_h>>((pin_d1-32)&31))&1)<<1);
}
if(pin_d2>-1) {
b |= (((pins_h>>((pin_d2-32)&31))&1)<<2);
}
if(pin_d3>-1) {
b |= (((pins_h>>((pin_d3-32)&31))&1)<<3);
}
if(pin_d4>-1) {
b |= (((pins_h>>((pin_d4-32)&31))&1)<<4);
}
if(pin_d5>-1) {
b |= (((pins_h>>((pin_d5-32)&31))&1)<<5);
}
if(pin_d6>-1) {
b |= (((pins_h>>((pin_d6-32)&31))&1)<<6);
}
if(pin_d7>-1) {
b |= (((pins_h>>((pin_d7-32)&31))&1)<<7);
}
}
Real programmers use butterflies
|
|
|
|
|
Seems OK. Any problems?
Looks like a code reflecting some hardware documentation table.
modified 15-Dec-21 4:11am.
|
|
|
|
|
Maintenance, testing, readability.
It's kind of messy in that regard.
It drives an 8-bit parallel bus using software. This is part of the code anyway.
Real programmers use butterflies
modified 15-Dec-21 11:48am.
|
|
|
|
|
How would you refactor that code so that it is mockery proof? Is it even possible? Just curious.
|
|
|
|
|
Well, I could make more inline functions to wrap that pin shifting, and probably reduce the number of if blocks, but that doesn't mean i'm going to.
Real programmers use butterflies
|
|
|
|
|
I would stick with the preprocessor macros myself. The results will be the same and it will be MUCH cleaner.
One thing that is puzzling is the gpio_input functions are called three times. Is that because of timing reasons?
"They have a consciousness, they have a life, they have a soul! Damn you! Let the rabbits wear glasses! Save our brothers! Can I get an amen?"
|
|
|
|
|
Yes, it is. There's a comment to that effect the *first* time I do it in the code.
I don't believe the results are cleaner. For starters, you should have seen the nested #ifdefs it took to do this.
Second, the preprocessor method of doing this suffered from a serious design difficiency.
You couldn't use multiple static "instances" of that to drive multiple buses, which is a problem when you have a device that runs more than one display, or even more than one SPI device (doesn't apply to the parallel code but in principle it could)
Everything defined in this code is inside a template, meaning the statics are one-per-instantiation and the arguments to the template are the pin assignments. Ergo, for each different collection of pins tied to a bus, you have a different set of statics to work with, meaning you can drive multiple displays.
I'd also argue this is cleaner because it's all typed, whereas the preprocessor macros are not. That matters for more than safety. These days it also means better intellisense/autocomplete, which means more productive mucking about with the source.
Real programmers use butterflies
|
|
|
|
|
I guess I disagree. With this macro :
#define SetPinA( pin, shift ) \
( pin > 31 ) ? ( ( pins_h >> ( ( pin - 32 ) & 31 ) & 1 ) << shift ) : \
( ( pins_l >> pin ) & 1 << shift ) I think the following code is much, much cleaner.
if( has_data_low_pins && has_data_high_pins )
{
b |= SetPinA( pin_d1, 1 );
b |= SetPinA( pin_d2, 2 );
b |= SetPinA( pin_d3, 3 );
b |= SetPinA( pin_d4, 4 );
b |= SetPinA( pin_d5, 5 );
b |= SetPinA( pin_d6, 6 );
b |= SetPinA( pin_d7, 7 );
} but that's just me. Your mileage may vary.
"They have a consciousness, they have a life, they have a soul! Damn you! Let the rabbits wear glasses! Save our brothers! Can I get an amen?"
|
|
|
|
|
We're talking about two different things.
You're talking about combining macros with this approach. That's not what I mean at all.
I mean ALL of this code. All of it. Was preprocessor macros. The if()s were #ifdefs
The reason that I wouldn't do what you're doing is I am a stickler about namespace polution.
As such I use preprocessor macros sparingly, and usually if they end up in my code 9 times out of 10 it's because of "Other People's Code" and the other 1 is usually because a #define/#ifdef is easy to -D when you go to compile the code.
Real programmers use butterflies
|
|
|
|
|
There is a difference between overusing a language feature and refusing to use it ever.
honey the codewitch wrote: I am a stickler about namespace polution
Myself I like to consider the maintenance cost of the code that I write. Maintenance costs will always be at least two times the initial cost and often will go to 10 times. And 100 times are not unheard of.
So how I write the code now leads to my consideration of what form the code should take. The form suggested by the other response seems much more maintainable.
Hard to say what your original code looked like and presuming that you mean "namespace" in the general sense rather than the C++ key word, then limiting the scope of the macro to the file itself means there would be no namespace pollution. It is not necessary to put everything in headers - it is just a convention.
But if you put the macro in a header you could even unit test it.
|
|
|
|
|
jschell wrote: The form suggested by the other response seems much more maintainable.
After perusing the thread, because it has been a minute, the only proposal that I am not already doing, or wouldn't make sense to do since it moves compile time actions to run time, was to leave the initial preprocessor macros, but those original preprocessor macros were never presented, so I'm confused as to how you could say it was more maintainable having never seen it.
Furthermore, it is not feasible to limit said preprocessor macros to a single file as doing so would increase maintenance since several processors with their own instructions are presented, each in its own header. To combine them would create a monolithic header that included every single build target. No.
Finally, unit testing this is not feasible, as I don't have the money to set up the hardware array necessary to unit test this across all build targets, nor the toolchain infrastructure necessary to make that doable, and would probably have to move away from using PlatformIO to do so anyway, which would make the effort required to do what I am doing snowball once I'm stuck with CMake builds using LLVM with gcc in funky ways. And even if I had the money, the time, and the software infrastructure to make that work, I don't have the physical space to set it all up.
Real programmers use butterflies
|
|
|
|
|
honey the codewitch wrote: so I'm confused as to how you could say it was more maintainable having never seen it.
A specific code sample was presented in a comment. Then you said.
"The reason that I wouldn't do what you're doing is I am a stickler about namespace polution."
You statement strongly suggested that you were responding to the code sample.
I did of course see the code sample in the posted response and then I responded to your comment about that code sample.
honey the codewitch wrote: Furthermore, it is not feasible to limit said preprocessor macros to a single file as doing so would increase maintenance since several processors with their own instructions are presented, each in its own header. To combine them would create a monolithic header that included every single build target. No.
No idea what you are talking about.
If you have one and exactly one C or C++ file (a .c or .cpp file) then you can define macros at the top of that file. The scope of those macros, in both languages are limited exclusively to that file. That is how those languages work.
If you have several files that use the same code then you can put the macro(s) and only those macros in an h file and then use that h file only in those code files. That means specifically that you do not put it in some other h file for convenience.
Now if you have the code snippet (not the macros) in an h file, then I suggest moving it out of the h file because it should not be in there in the first place.
honey the codewitch wrote: Finally, unit testing this is not feasible, as I don't have the money to set up the hardware array necessary to unit test this across all build targets
Huh?
A "unit test" in the context my statement would never be tested in that fashion. You are testing the macro not the system.
|
|
|
|
|
Presently the macros are in about 6 h files.
I did not write it that way, and I'm not rewriting six files of macros.
I am not unit testing all of this garbage because it's impossible. It's all timing sensitive and each one is different for each piece of hardware. No.
Real programmers use butterflies
|
|
|
|
|
honey the codewitch wrote: I did not write it that way, and I'm not rewriting six files of macros.
As I already said in my previous post...Your statement strongly suggested that you were responding to the code sample.
To be very clear - that would be the code sample that was posted in this thread and not your code.
That code sample does not require "six files of macros".
|
|
|
|
|
honey the codewitch wrote: the *first* time I do it in the code. I had heard of weird places to do it, but in the code....
M.D.V.
If something has a solution... Why do we have to worry about?. If it has no solution... For what reason do we have to worry about?
Help me to understand what I'm saying, and I'll explain it better to you
Rating helpful answers is nice, but saying thanks can be even nicer.
|
|
|
|
|
Yep, you are right, I would not want to see this preprocessor macros, that would fry my brain.
Could you put the pins_l and and pins_h into a single uint64_t or is that not supported?
That would make the processing easier.
Failing that you could simplify it a little.
if(pin_d0>31) {
b = (pins_h>>(pin_d0&31))&1;
} else if(pin_d0>-1) {
b = (pins_l>>pin_d0)&1;
} else {
b=0;
}
I am sure there are other optimisations that could be done, but I am done for the day. Good luck!
|
|
|
|
|
It would require more shifts because those routines that fill it return uint32_t
Also it supports it but the proc is native 32-bit and I don't want to have any hidden overhead for uint64_t.
Finally, in some of the code paths, I am only using pins_l, or pins_h. Only in one fork am I using both.
Real programmers use butterflies
|
|
|
|
|
I realise you're not asking for suggestions but I would certainly be minded to simplify this, particularly if this is for general purpose use. If you're in a situation where this is for an embedded platform where the hardware configuration is fixed you should gain both better readability and performance if you can spare a little RAM. Something like this:
const int NUM_BUS_BITS = 8; const int NUM_GPIO_PINS = 40;
static uint32_t sGPIOLow, sGPIOHigh;
static struct
{
uint32_t* pSource; unsigned shift; } sDataMap[NUM_BUS_BITS];
static void mapInputPin(unsigned dataBit, unsigned gpioPin)
{
if (dataBit < NUM_BUS_BITS && gpioPin < NUM_GPIO_PINS)
{
sDataMap[dataBit].pSource = gpioPin >= 32 ? &sGPIOHigh : &sGPIOLow;
sDataMap[dataBit].shift = gpioPin & 31;
}
}
static uint8_t readInput(void)
{
sGPIOLow = gpio_input_get();
sGPIOLow = gpio_input_get();
sGPIOLow = gpio_input_get();
sGPIOHigh = gpio_input_get_high();
uint8_t dataRead = 0;
dataRead |= (((*sDataMap[0].pSource) >> sDataMap[0].shift) & 1) << 0;
dataRead |= (((*sDataMap[1].pSource) >> sDataMap[1].shift) & 1) << 1;
dataRead |= (((*sDataMap[2].pSource) >> sDataMap[2].shift) & 1) << 2;
dataRead |= (((*sDataMap[3].pSource) >> sDataMap[3].shift) & 1) << 3;
dataRead |= (((*sDataMap[4].pSource) >> sDataMap[4].shift) & 1) << 4;
dataRead |= (((*sDataMap[5].pSource) >> sDataMap[5].shift) & 1) << 5;
dataRead |= (((*sDataMap[6].pSource) >> sDataMap[6].shift) & 1) << 6;
dataRead |= (((*sDataMap[7].pSource) >> sDataMap[7].shift) & 1) << 7;
return dataRead;
}
int main(void)
{
mapInputPin(0, 4);
mapInputPin(1, 5);
mapInputPin(2, 6);
mapInputPin(3, 7);
mapInputPin(4, 36);
mapInputPin(5, 37);
mapInputPin(6, 38);
mapInputPin(7, 39);
uint8_t readData = readInput();
return readData;
}
The readInput() function is basically what you have above but doesn't require the conditionals since you've defined ahead of time where data should be read from and how it should be shifted. It reads both low and high GPIO registers but I doubt that will be noticeable but could be simplified prior to the decoding if absolutely necessary.
The multiple reads from the GPIO bus is a bit of a red flag, however. I am not familiar with the platform (I'm guessing ESP32) but that would usually suggest to me that a hardware clock hasn't been set up correctly or that your timing is marginal. If it's the latter then I'd suggest you need a more robust delay mechanism as you may be lucky on this board but it'll be intermittent on others.
|
|
|
|
|
Those conditionals aren't executed at runtime. They are constexpr, as are most of the shifts, and the pin #s themselves.
The reason for the multiple reads is timing. There is no hardware clock involved other than the CPU clock itself as the ESP32 does not have a hardware parallel 8-bit interface. It's all bit banging.
The other ways I could have done it would have required me to drop to asm to insert the appropriate amount of delay because the arduino and idf framework delays are not granular enough. I need to cycle count. I'm not putting asm in my code, because ESP32s don't even use the same CPU across all iterations of them.
Real programmers use butterflies
|
|
|
|
|
OK - it wasn't immediately clear that the bit mapping was constant at compile time. Personally, I'd still be inclined to do something like this but would want to confirm it was optimising as expected (which I'm sure you've done with the existing code):
const unsigned pin_d0 = 4;
const unsigned pin_d1 = 5;
const unsigned pin_d2 = 6;
const unsigned pin_d3 = 7;
const unsigned pin_d4 = 36;
const unsigned pin_d5 = 37;
const unsigned pin_d6 = 38;
const unsigned pin_d7 = 39;
static uint8_t readInput(void)
{
uint32_t GPIOLow = gpio_input_get();
uint32_t GPIOHigh = gpio_input_get_high();
uint8_t dataRead = 0;
dataRead |= (((pin_d0 > 31 ? GPIOHigh : GPIOLow) >> (pin_d0 & 31)) & 1) << 0;
dataRead |= (((pin_d1 > 31 ? GPIOHigh : GPIOLow) >> (pin_d1 & 31)) & 1) << 1;
dataRead |= (((pin_d2 > 31 ? GPIOHigh : GPIOLow) >> (pin_d2 & 31)) & 1) << 2;
dataRead |= (((pin_d3 > 31 ? GPIOHigh : GPIOLow) >> (pin_d3 & 31)) & 1) << 3;
dataRead |= (((pin_d4 > 31 ? GPIOHigh : GPIOLow) >> (pin_d4 & 31)) & 1) << 4;
dataRead |= (((pin_d5 > 31 ? GPIOHigh : GPIOLow) >> (pin_d5 & 31)) & 1) << 5;
dataRead |= (((pin_d6 > 31 ? GPIOHigh : GPIOLow) >> (pin_d6 & 31)) & 1) << 6;
dataRead |= (((pin_d7 > 31 ? GPIOHigh : GPIOLow) >> (pin_d7 & 31)) & 1) << 7;
return dataRead;
}
In practice, I would almost certainly encapsulate the individual lines as macros or as an inline function but it's probably clear enough what it's doing. Reads from both ports are kept in there - if a single hardware register read is really killing you then it could be eliminated.
The delay aspect would still concern me. This is presumably reading from an external device (LCD?) and there will be a defined timing between initiating a read and the data being available. You will get away with your approach on a specific hardware platform but in future it could easily fall down if the clock speed changes or the external delays change. Dropping to assembler shouldn't be necessary - but your code shouldn't depend on how long it takes to do a dummy read.
As an aside, the original code has instances of this:
if(pin_d1>31) {
b |= (((pins_h>>((pin_d1-32)&31))&1)<<1);
} else if(pin_d1>-1) {
b |= (((pins_l>>(pin_d1))&1)<<1);
}
Subtracting 32 from the pin value before masking serves no purpose - you're only using the lower bits anyway.
If you ensure your pin values are non-negative (which could / should be possible at compile time) then it is also redundant to test against -1.
|
|
|
|
|
Message Closed
modified 13-Dec-21 21:05pm.
|
|
|
|
|
Any particular reason you posted this in The Weird and the Wonderful?
What do you get when you cross a joke with a rhetorical question?
The metaphorical solid rear-end expulsions have impacted the metaphorical motorized bladed rotating air movement mechanism.
Do questions with multiple question marks annoy you???
|
|
|
|
|
Yeah I just deleted it. I accidently landed here instead of the lounge and I wasn't paying close enough attention apparently.
Real programmers use butterflies
|
|
|
|
|