I’m working on a community desktop client, and we’re interested in allowing users to algorithmically manage their Meshtastic networks. We’re interested in running algorithms that can warn of network bottlenecks, detect potential network faults, and ensure connection to all nodes is reliable.
One of the steps we see towards that goal is to be able to receive information on the current state of the mesh. To this end, I’ve written up a proposal (five pages, ~8 min read) to collect the required network information.
I’ve attached a hosted link to the PDF on my personal site. I figured I’d get thoughts on this before the meet and greet this Sunday. Would love to hear what people think! I’ve also attached a brief summary of the proposal below.
We are looking to introduce a new mesh packet that would allow for advanced network analysis to be performed on Meshtastic networks. This packet would contain information on the connection each node has to other nodes in the network. Some key takeaways:
This functionality would be disabled by default, and would require a user to manually enable the transmission of this type of packet.
Assuming a constant network density d, the payload size of such packets would increase on the order of O(dn), not on the order of O(n^2).
There is the potential for large routing improvements within networks if the routing protocol utilizes this network state data.
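To illustrate the kind of analysis this data could enable, here is a minimal sketch that builds a graph from per-node neighbor reports and flags nodes whose loss would split the mesh (articulation points), which is one way to define a bottleneck. The report format (`{node_id: {neighbor_id: snr_db}}`) and the function names are assumptions for illustration only, not the packet layout in the proposal, and links are treated as symmetric.

```python
from collections import defaultdict


def build_graph(reports):
    """Build an undirected adjacency map from neighbor reports.

    reports: {node_id: {neighbor_id: snr_db}} (hypothetical format).
    """
    graph = defaultdict(set)
    for node, neighbors in reports.items():
        graph[node]  # make sure isolated nodes still appear in the graph
        for neighbor in neighbors:
            graph[node].add(neighbor)
            graph[neighbor].add(node)  # assumption: links are symmetric
    return graph


def articulation_points(graph):
    """Return the nodes whose removal would disconnect the graph."""
    discovered, low, result = {}, {}, set()
    counter = [0]

    def dfs(node, parent):
        discovered[node] = low[node] = counter[0]
        counter[0] += 1
        children = 0
        for neighbor in graph[node]:
            if neighbor == parent:
                continue
            if neighbor in discovered:
                # Back edge: neighbor was reached earlier in the search.
                low[node] = min(low[node], discovered[neighbor])
            else:
                children += 1
                dfs(neighbor, node)
                low[node] = min(low[node], low[neighbor])
                # The subtree under neighbor cannot reach above node.
                if parent is not None and low[neighbor] >= discovered[node]:
                    result.add(node)
        # The search root is a cut node only if it has 2+ DFS children.
        if parent is None and children > 1:
            result.add(node)

    for node in list(graph):
        if node not in discovered:
            dfs(node, None)
    return result


reports = {
    "A": {"B": 5.0, "C": 3.0},
    "B": {"A": 5.0, "C": 2.0},
    "C": {"A": 3.0, "B": 2.0, "D": -1.0},
    "D": {"C": -1.0},
}
bottlenecks = articulation_points(build_graph(reports))
```

In this example node C is the only path to D, so it is the only node reported. The recursive search is fine for meshes of a few hundred nodes; a much larger graph would need an iterative version.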
This is a proposal, and as such all feedback is welcome!
First of all, I must say I am impressed with the work you did! The desktop client looks very promising.
Though, to be honest, I don’t see how your proposed packet would give a lot of additional valuable information. As you said, we can already get SNR information of the nodes a device is directly connected to and the RX time can already be derived across hops. Indeed your approach would allow you to draw the full graph (assuming this data actually arrives).
Also consider that the maximum payload of a packet is around 251 bytes, so you would likely need to split your network state packet up. Moreover, to let it be spread over the full network, due to flooding there might be n-1 duplications of this packet. So while the payload size only increases on the order of O(dn), the overall overhead would increase with O(dn^2) in the worst case.
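A back-of-the-envelope sketch of that scaling argument follows. The 251-byte limit is the figure quoted above; `bytes_per_link` is a made-up placeholder, since the actual encoding of a link entry is not specified here.

```python
import math

MAX_PAYLOAD_BYTES = 251  # approximate limit quoted in this thread


def flooding_overhead(n, d, bytes_per_link=6):
    """Estimate the cost of one full network-state update.

    n: number of nodes, d: average links per node (network density),
    bytes_per_link: hypothetical size of one encoded link entry.
    """
    payload = d * n * bytes_per_link  # grows as O(dn)
    fragments = math.ceil(payload / MAX_PAYLOAD_BYTES)
    # Worst case under flooding: the original transmission plus
    # n - 1 rebroadcasts, so total airtime bytes grow as O(dn^2).
    worst_case_bytes = payload * n
    return payload, fragments, worst_case_bytes


small = flooding_overhead(n=20, d=3)
large = flooding_overhead(n=40, d=3)
```

With these placeholder numbers, doubling the network from 20 to 40 nodes doubles the payload (360 to 720 bytes) but quadruples the worst-case bytes on air (7,200 to 28,800).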
But, apart from changing the hop limit or (re)placing (additional) nodes, what can we do with this to make the network more reliable?
I have been thinking about an automatic hop limit setting that would be just sufficient to let a packet arrive at each destination without creating additional rebroadcasts. This can already be derived using the Traceroute module, or it can be achieved by adding the current hop limit setting to the DeviceTelemetry or NodeInfo packets, such that you can derive how many hops a packet needed to arrive at you.
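As a rough sketch of the second option: if a packet carried the hop limit its sender started with, a receiver could subtract the remaining hop limit in the header to see how many hops were used. The field names below (`sender_hop_limit`, `remaining_hop_limit`) are hypothetical, and the sketch assumes paths are roughly symmetric, so that the hops needed to reach you are the hops you need to reach the sender.

```python
def hops_used(sender_hop_limit, remaining_hop_limit):
    """Number of rebroadcasts a packet needed before it arrived here."""
    if remaining_hop_limit > sender_hop_limit:
        raise ValueError("remaining hop limit exceeds the sender's setting")
    return sender_hop_limit - remaining_hop_limit


def suggested_hop_limit(observations, margin=0):
    """Smallest hop limit that would have covered every observed packet.

    observations: iterable of (sender_hop_limit, remaining_hop_limit)
    pairs, one per received packet. margin adds optional headroom.
    """
    needed = max((hops_used(s, r) for s, r in observations), default=0)
    return needed + margin


observed = [(3, 3), (3, 1), (4, 2)]  # direct, two hops, two hops
limit = suggested_hop_limit(observed)
```

Here the farthest sender was two hops away, so a hop limit of 2 would have been enough. A nonzero `margin` trades some extra rebroadcasting for tolerance to nodes moving or dropping out.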
Other more sophisticated routing algorithms (e.g. Distance Vector Routing) would likely need breaking changes, not only network state data. For example, instead of including only the original sender and destination in the header, also add the next-hop receiver and current transmitter of a packet, such that a specific route can be set up.
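For readers unfamiliar with Distance Vector Routing, a minimal sketch of the core table update is below. It is a generic textbook-style illustration, not anything in the Meshtastic firmware; the table layout and the fixed `link_cost` of one hop are assumptions. It shows why the header would need a next-hop field: each entry records which neighbor to hand a packet to.

```python
def merge_vector(table, neighbor, neighbor_vector, link_cost=1):
    """Merge a neighbor's advertised costs into our routing table.

    table: {destination: (cost, next_hop)}, updated in place.
    neighbor_vector: {destination: cost_from_neighbor}.
    Returns True if any route changed (i.e. we should re-advertise).
    """
    changed = False
    for destination, cost in neighbor_vector.items():
        candidate = cost + link_cost
        # Adopt the route if it is new or cheaper than what we have.
        if destination not in table or candidate < table[destination][0]:
            table[destination] = (candidate, neighbor)
            changed = True
    return changed


table = {"A": (0, "A")}  # node A's own table: it reaches itself at cost 0
merge_vector(table, "B", {"B": 0, "C": 1, "A": 1})
merge_vector(table, "D", {"D": 0, "C": 0})
```

After both updates, A reaches C via D at cost 1 rather than via B at cost 2. A real protocol would also need route expiry and protection against routing loops, which this sketch leaves out.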
Did you by any chance stumble upon the simulator? It allows for network analysis of a scenario by use of simulation, either using discrete events to mimic large networks in a relatively short time, or by using the Meshtastic Linux native application (can also be used on Windows and Mac using Docker), i.e., the real firmware with emulated LoRa radio. The benefit of such an approach is that you can have an oracle view of the network without any overhead, and see what happens if you add another node somewhere or change the hop limit of a node. But, as you can see, I am not a UI guy and I would actually love to see it integrated with a desktop client.
Thanks for the feedback! Sorry for a bit of delay, I wanted to run your thoughts by the rest of the team.
To clarify, regarding your comment on making the network more reliable: our intended goal for the packet isn’t necessarily to make the network more reliable in a routing sense. We’re not envisioning routing improvements as the primary purpose of this packet, although they were called out in the proposal in the hope that they could be a positive side effect of the changes. The goal we’re looking to hit is allowing higher user confidence in the network, as well as giving users more insight into the network topology. We expect that, by giving users a deeper view into how the network is connected, users of our client could take an active role in repositioning network nodes to optimize performance and packet transmission reliability.
On the network simulator comment, thanks for the heads-up! I hadn’t run this change through the simulator before posting the proposal. I didn’t have any luck spinning up the simulator on my Windows (11, running in Docker) or Linux (Arch, running native) machines to test this type of change, although that’s probably outside the scope of this thread. If I can’t get it working within the next day or two I’ll follow up in GitHub or Discord.
Also, thanks for pointing out that packet flooding would lead to O(dn^2) overhead and not O(dn). That’s something I had missed in the proposal PDF.
On the TraceRoute module, that’s also something I wasn’t familiar with. I’m curious, though, whether we would be able to efficiently and reliably build out a graph of network connections using the TraceRoute module. It seems like this packet is intended to debug paths to single other nodes, and not necessarily to build out all network connections. That being said, I haven’t dug very deeply into the implementation of the TraceRoute packet, so let me know if I’m missing something on that front.
That makes sense then and since you propose to let a user opt-in, there wouldn’t be any overhead by default. I think the TraceRoute module can also really help with giving more insight. However, you’re right, per packet it shows the path to one destination node only (although you can derive the path of intermediate nodes as well then). Actually this module is brand new and I think it is not too difficult to give it more functionality, e.g. to let node A request a TraceRoute from B to C and report that back to A, such that you can visualize it.
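As a small sketch of how traced paths could be turned into a partial graph: each consecutive pair of nodes in a route implies a link, so collecting routes gradually fills in the topology. The input format (a list of node IDs per route) is an assumption, not the actual TraceRoute payload, and links are treated as undirected here even though a real radio link may be asymmetric.

```python
def edges_from_routes(routes):
    """Collect the set of links implied by a list of traced routes.

    routes: iterable of node-ID lists, each ordered from source to
    destination (hypothetical format).
    """
    edges = set()
    for route in routes:
        # Every adjacent pair along a route heard each other directly.
        for a, b in zip(route, route[1:]):
            edges.add(frozenset((a, b)))
    return edges


routes = [
    ["A", "B", "C"],  # A traced a route to C via B
    ["A", "B", "D"],  # A traced a route to D via B
]
links = edges_from_routes(routes)
```

Two traces from A already reveal three links (A-B, B-C, B-D), but any link that is not on a traced path stays invisible, which is the main limitation compared to each node reporting its own neighbors.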
I ran it on a native Windows machine with the Linux applications in a lightweight Docker container that is built from the firmware directory. If the native application does not work on Arch Linux, you can also use Docker there; I tested this on Ubuntu. But indeed, feel free to ask me questions if you don’t get it working.
@ajmcquilkin I can’t really speak to the intended goals of your project, though it definitely sounds interesting and I’ll be sure to follow it. I did want to compliment you on the client you are working on. I came across your GitHub and gave it a browse, and so far it looks pretty nice. I look forward to seeing more updates. Good luck!
Interesting, I’ll take a look into how the TraceRoute module functions and what it would entail to expand it to our use case. The one thought I would have is that it would require triggering on the part of the client, and if there were to be multiple clients on the network implementing this graph-building functionality I can see this overwhelming most networks.
I also just got the client working on native Linux. The issue I was having was related to gnome-terminal being the default terminal on the repo, which I solved per your issue comment. I’ll work on analyzing the impact of this packet in the simulator over the next few days.