Recommendations for balancing speed and safety against malicious data

Question

0.00/5 (No votes)

See more:

Hello,

I'm writing a communications protocol between a game client and server application. I've written some methods to construct a packet from a received string.

std::map<int, PacketType> packetRegistry:
PacketPlayerMove has ID 0
PacketPlayerChat has ID 1

So if 0,ab12cd34ef56ab78,100,0,250 is received from the client, it's packet ID 0 and therefore a PacketPlayerMove. The remaining arguments are passed to the constructor as a std::vector<string>. This ends up producing a PacketPlayerMove with player ID ab12-cd34-ef56-ab78 at location (100, 0, 250) in 3D space, and this gets broadcast to the other players to move the corresponding player model in the world.

However, naturally I can't trust the received string to not be malicious. I've already implemented a check to ensure the packet ID is in the packetRegistry, but I'm not sure whether I should bother to check the parameter count, lengths, data types, perhaps regex for format, etc, in the event something nasty is received such as a huge string which could crash the server. How would you recommend I deal with this?

What I have tried:

------------------------------------------------------------

Posted 10-Jul-18 0:32am

User 13204940

Updated 10-Jul-18 1:00am

v2

Add a Solution

Comments

KarstenK 10-Jul-18 8:01am

"Safety first" is always the first consideration ;-)

1 solution

Add a Solution

Add your solution here

Treat my content as plain text, not as HTML

Preview 0

…

Existing Members

Sign in to your account

...or Join us

Download, Vote, Comment, Publish.

Your Email
Password
Forgot your password?

Your Email
This email is in use. Do you need your password?
Optional Password

I have read and agree to the Terms of Service and Privacy Policy
Please subscribe me to the CodeProject newsletters

When answering a question please:

Read the question carefully.
Understand that English isn't everyone's first language so be lenient of bad spelling and grammar.
If a question is poorly phrased then either ask for clarification, ignore it, or edit the question and fix the problem. Insults are not welcome.
Don't tell someone to read the manual. Chances are they have and don't get it. Provide an answer or move on to the next question.

Let's work to help developers, not make them feel stupid.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Jochen Arndt · Answer 1 · 2018-07-10T01:00:00

Solution 1

Checking the data is necessary to avoid undefined behaviour or even letting your application crash. I guess that you are using UDP because you are broadcasting. Then data may be inconsistent (rare but may happen).

I suggest to not use strings but raw data instead. That avoids conversions to and from string (speed) and makes the packets usually smaller. You can also provide a checksum to drop invalid packets.

Such a packet might for example look like

C++

typedef struct {
    uint32_t type; // Fixed packet type identifier
    uint32_t length; // Total length of packet
    uint32_t checksum; // Checksum over all following data
    uint32_t version; // Version: structure can be extended
    //uint32_t flags; // Optional flags defining which data are present
    uint32_t id;
    uint32_t loc_x;
    uint32_t loc_y;
    uint32_t loc_z;
    uint32_t name_len;
    uint8_t name[1]; // Placeholder for name
} net_packet_t;

If you have to pass multiple strings (or any other data with variable length), add for each an offset value beside the length value. Then you can use that to locate the next field.

Posted 10-Jul-18 1:00am

Jochen Arndt

Comments

[no name] 10-Jul-18 7:18am

My issue with that is serialising it and converting it back to the types specified in the struct. Surely that doesn't avert the string conversion issue either.

Jochen Arndt 10-Jul-18 7:38am

I don't get you. Use the same variable types for the members as in your code. Then there is only the string which can be simply copied. Serialising that structure is quite simple and avoids any conversions from numbers to strings and vice versa (you will notice the benefits once you have to deal with floating point or datetime data).

Regarding the string:
I missed that it looks like a series of hex codes. Then use a byte (uint8_t) array if such is used by your code or pass it as described if already present as string. In any case you can still apply further (validity) checks on it.

It is just about data and it's integrity. My suggestion is to use raw data which makes handling and checking simpler (and faster: you asked for it). The checksum should catch all cases of integrity violation besides specifically generated malicious packets which require knowledge of the data structure and the checksum algorithm.

Rick York 10-Jul-18 12:28pm

I agree. I always transmit raw data in its binary form. Then there are no conversions and bounds check is just standard stuff from the rest of the application. "Serializing" then becomes a memcpy.

[no name] 12-Jul-18 3:45am

I'm using a class instance for each packet type with member variables to hold the data. The class has a serialize() method which puts all of the data together in a CSV string, with the packet ID as the first value.

Jochen Arndt 12-Jul-18 4:21am

You asked for speed and safety.

Using raw (binary) data satifies both better than a CSV string.
So either change your serialisation to raw data or keep your string accepting the disadvantages.

If you already have a class with members that should be part of the package, you can even put these into a structure which can be then a class member as well as part of the packet.

KarstenK 10-Jul-18 11:46am

I am actually on some UDP traffic stuff and collisions are common. UDP is "the collision relevant" protocol :-O