I am working with a dataset and trying to make predictions on it using supervised learning.
Problem Statement: Suppose we have data about routers as given below, with the possible values for each feature.

Features
--------
- Name (Unique)
- Brand : Cisco, Huawei, Netgear
- Location: Bedroom(B), Kitchen(K), Hall(H)
- IP address : Unique
- MacAddress : Unique
- Serial Number : Unique
- Firmware version: Varying like 1.0.0, 2.0.1, 3.1.0 etc
- state: running, discovering, rebooting

Output:
-------
Strength: Strong/Weak: 1/0

The rules governing the 'Strength' output are given below.
- Brand == Cisco ==> Strength == Strong in all locations. B + H + K
- Brand == Huawei ==> Strength == Strong in Hall and kitchen only. H + K
- Brand == Netgear ==> Strength == Strong in kitchen only. K

We consider Brand and Location only for predicting Signal Strength.
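
For comparison with a learned model, the three rules above can be written directly as a lookup table. This is only a minimal sketch: the names `STRONG_LOCATIONS` and `strength` are made up for illustration, and it assumes the rules are complete and exact as stated. Under that assumption the whole input space is 3 brands x 3 locations = 9 combinations, which is relevant to question 1 below.

```python
# Rule table transcribed from the three rules in the post:
# brand -> set of locations where the signal is Strong.
# (STRONG_LOCATIONS and strength are hypothetical names for this sketch.)
STRONG_LOCATIONS = {
    "Cisco": {"Bedroom", "Hall", "Kitchen"},
    "Huawei": {"Hall", "Kitchen"},
    "Netgear": {"Kitchen"},
}
LOCATIONS = ("Bedroom", "Hall", "Kitchen")


def strength(brand: str, location: str) -> int:
    """Return 1 (Strong) or 0 (Weak) for a brand/location pair."""
    return 1 if location in STRONG_LOCATIONS[brand] else 0


# Enumerate the whole input space: 3 brands x 3 locations = 9 rows.
for brand in STRONG_LOCATIONS:
    for location in LOCATIONS:
        label = "Strong" if strength(brand, location) else "Weak"
        print(f"{brand:8}{location:8}{label}")
```

If the real system follows fixed rules like these, a table of this kind is exact and needs no training data. A learned model only becomes interesting when the rules are unknown or noisy.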

Sample Train Data
=================
|Brand  |Location|Strength|
---------------------------
|Cisco  |Bedroom |Strong  |
|Huawei |Bedroom |Weak    |
|Netgear|Hall    |Weak    |
|Cisco  |Kitchen |Strong  |
|Huawei |Hall    |Strong  |
|Netgear|Kitchen |Strong  |

Sample Test Data
=================
|Brand  |Location|Strength|
---------------------------
|Cisco  |Hall    |Strong  |
|Huawei |Kitchen |Strong  |
|Netgear|Bedroom |Weak    |

Questions:
1. Can this problem be solved using machine learning, or is machine learning overkill?
2. Which algorithm/architecture should be used for such a problem? Is a plain neural network enough, or is a CNN more appropriate, considering future scaling?
3. Can we include any other features for better accuracy?
4. How much data is enough to start with?


Kindly share your suggestions.

Thanks in advance!!!

What I have tried:

The question is about analyzing the problem and validating it.
I tried SGD and it seems to work.
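
The post does not show the SGD setup that was tried, so the following is only a stand-in sketch: a perceptron, which updates its weights one sample at a time much like SGD on a linear model, trained on the six sample training rows with one-hot encoded Brand and Location. The function names (`encode`, `train_perceptron`, `predict`), the encoding, and the fixed sample order are all assumptions for illustration, not the original code.

```python
BRANDS = ["Cisco", "Huawei", "Netgear"]
LOCATIONS = ["Bedroom", "Hall", "Kitchen"]
N_FEATURES = len(BRANDS) + len(LOCATIONS) + 1  # one-hot columns + bias


def encode(brand, location):
    """One-hot encode brand and location, plus a constant bias term."""
    x = [0] * N_FEATURES
    x[BRANDS.index(brand)] = 1
    x[len(BRANDS) + LOCATIONS.index(location)] = 1
    x[-1] = 1  # bias input
    return x


def score(w, x):
    """Linear score: dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_perceptron(rows, max_epochs=100):
    """Perceptron rule: update on each misclassified sample, step size 1."""
    w = [0] * N_FEATURES
    for _ in range(max_epochs):
        mistakes = 0
        for brand, location, label in rows:
            x = encode(brand, location)
            y = 1 if label == 1 else -1
            if y * score(w, x) <= 0:  # wrong side of (or on) the boundary
                w = [wi + y * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:  # a full pass with no updates: converged
            break
    return w


def predict(w, brand, location):
    """Return 1 (Strong) if the linear score is positive, else 0 (Weak)."""
    return 1 if score(w, encode(brand, location)) > 0 else 0


# Sample rows from the post (1 = Strong, 0 = Weak).
TRAIN = [
    ("Cisco", "Bedroom", 1), ("Huawei", "Bedroom", 0), ("Netgear", "Hall", 0),
    ("Cisco", "Kitchen", 1), ("Huawei", "Hall", 1), ("Netgear", "Kitchen", 1),
]
TEST = [("Cisco", "Hall", 1), ("Huawei", "Kitchen", 1), ("Netgear", "Bedroom", 0)]

weights = train_perceptron(TRAIN)
train_predictions = [predict(weights, b, l) for b, l, _ in TRAIN]
test_predictions = [predict(weights, b, l) for b, l, _ in TEST]
print("train:", train_predictions)
print("test: ", test_predictions)
```

A linear model can represent these particular rules because they are additive: a brand score plus a location score compared against a threshold reproduces all three. With only nine possible inputs, though, agreement on three held-out rows says little about how the approach would behave on real, noisy measurements.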
Updated 25-Mar-19 19:11pm

Solution 2

Praveen, RF propagation tools already exist and have been refined to model multipath in building materials to 8 decimal places. They are used professionally to create in-building cellular and WiFi signal-level models based on wireless router placement, antenna selection, wall, floor, and ceiling construction materials, etc., to the finest degree. Once a design model is complete, the equipment is installed and then verified by walk-testing the space. Results are usually within 2 to 3 dB of predictions.

Collecting results from various wireless routers without a refined signal-level readout leads to an endless quagmire: a futile attempt to correlate measurement data in an uncontrolled, multipath environment. You will not get the same answer twice for multiple identical measurements! You did not state whether you are using the 2.4 GHz or 5 GHz band; either way, the placement of your antenna(s) needs to be exactly identical. At 2.4 GHz, a 1.2 inch difference changes the data entirely. At 5 GHz, a 0.5 to 0.6 inch relocation is an entirely different test setup in a multipath environment. Also, the human body is a large bag of saline that absorbs RF signal energy and multipath, which affects the measurement results.
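
As a rough cross-check of the distances quoted above, they are close to a quarter of the free-space wavelength at each band. That interpretation is an assumption and is not stated in this answer. The sketch below only does the arithmetic, wavelength = c / f, converted to inches. The function name `quarter_wavelength_inches` is made up for illustration.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by SI definition
METERS_PER_INCH = 0.0254              # exact


def quarter_wavelength_inches(frequency_hz: float) -> float:
    """Quarter of the free-space wavelength at frequency_hz, in inches."""
    wavelength_m = SPEED_OF_LIGHT_M_PER_S / frequency_hz
    return wavelength_m / 4 / METERS_PER_INCH


for label, frequency_hz in (("2.4 GHz", 2.4e9), ("5 GHz", 5.0e9)):
    quarter = quarter_wavelength_inches(frequency_hz)
    print(f"{label}: quarter wavelength is about {quarter:.2f} in")
```

This gives about 1.23 in at 2.4 GHz and about 0.59 in at 5 GHz, in line with the 1.2 inch and 0.5 to 0.6 inch figures quoted.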

Rather than be misled by signal level, look at the over-the-air connect rate that the router-to-laptop-client link is using. You need to be passing data, and you should look at the data frames only. Use a wireless card and Ethereal, or some other over-the-air capture equipment. AirMagnet is good and expensive.

If you don't understand RF propagation, you should stop this effort and study propagation characteristics first.
   
Comments
Praveen Raghuvanshi 3-May-19 23:17pm
   
Thank you, Hoota, for analyzing the problem and suggesting a solution. I am new to the field of machine learning and wanted to apply it to my domain to make the application smarter. The problem is not about routers but about devices connected to a network within a venue. Let me provide some more details about the actual system.

Consider a venue with a lot of devices. Each device has parameters such as Name, IP address, MAC address, Manufacturer, Model, Location, etc. A system integrator installs these devices across the venue and assigns parameters such as Name, IP address, and Location. While setting things up, he/she may develop an intuition for stable areas (locations) and start installing devices there instead of in less stable areas. Stability could be determined through network strength, environmental conditions (weather), etc. I am thinking of predicting these stable areas through machine learning, which would help the system integrator spend less time determining stable areas and thereby improve efficiency.

Is this problem statement valid?
Can we solve it through machine learning?
Are we missing parameters, or do we need more?

Thanks.

Solution 1

The training data has no relation to the "test data" (location); no meaningful "prediction" is possible.

The size of the training data is also "too small" to be taken seriously.
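
One possible reading of the first objection, which this answer does not spell out, is that none of the test rows' brand/location pairs appear in the training rows. The sketch below checks that against the sample tables in the question. The set names are made up for illustration.

```python
from itertools import product

BRANDS = ("Cisco", "Huawei", "Netgear")
LOCATIONS = ("Bedroom", "Hall", "Kitchen")

# (brand, location) pairs copied from the sample tables in the question.
TRAIN_PAIRS = {
    ("Cisco", "Bedroom"), ("Huawei", "Bedroom"), ("Netgear", "Hall"),
    ("Cisco", "Kitchen"), ("Huawei", "Hall"), ("Netgear", "Kitchen"),
}
TEST_PAIRS = {("Cisco", "Hall"), ("Huawei", "Kitchen"), ("Netgear", "Bedroom")}

ALL_PAIRS = set(product(BRANDS, LOCATIONS))  # 3 x 3 = 9 possible inputs

overlap = TRAIN_PAIRS & TEST_PAIRS    # pairs seen in both sets
uncovered = ALL_PAIRS - TRAIN_PAIRS   # pairs the training set never shows

print("possible pairs:", len(ALL_PAIRS))
print("train/test overlap:", sorted(overlap))
print("never seen in training:", sorted(uncovered))
```

On the sample data the overlap is empty, and the three pairs missing from training are exactly the test rows. Each brand and each location does appear in training on its own, so a model can only get the test rows right by combining what it learned about brands and locations separately.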
   
Comments
Praveen Raghuvanshi 26-Mar-19 5:25am
   
Thanks, Gerry, for the response. Can you elaborate on 'no relation to test data (location)'? Also, can we use some other feature to make it more meaningful? An analogy/link would be great.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



