SMAZ - compression for very small strings

Spor7biker · July 26, 2020, 3:01am

Thoughts?

Another, although more complicated option is

geeksville · July 26, 2020, 3:17am

hmm - I think adding compression is a great idea (and using a bit in our existing flags header field to say ‘this message is compressed’). But I think we’d need to use a generalized compression library, because a fair portion of each message is framing, and other data (e.g. positions) is not text (each message is sent as a Protocol Buffer).

Also - as time moves on, there will be other payloads (for API consumers that aren’t a text messaging app). So I think plopping in a standard compression lib and just compressing the entire packet we send is probably a) a pretty good win and b) fairly straightforward.

Spor7biker · July 26, 2020, 3:24am

If the framing and other data can be compressed much I’m guessing there is redundant data that should be minimized.

One of the advantages to SMAZ is the decompression dictionary is not included in each payload.

General compression programs can make small payloads larger due to the need of including the dictionary.
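A quick way to see this effect is to measure it. The sketch below uses Python's zlib as a stand-in for a "general compression program" (an assumption, not something proposed in this thread). With zlib the growth on short inputs comes from the stream header, checksum, and block framing rather than a literal dictionary, but the result is the same: very small payloads get bigger.

```python
import zlib

def overhead(message: bytes) -> int:
    """Return compressed size minus original size (positive means it grew)."""
    return len(zlib.compress(message, 9)) - len(message)

short = b"On my way"                       # typical tiny text message
long_text = b"On my way back to camp. " * 20  # long, repetitive input

print("short:", overhead(short))       # positive: the output is larger
print("long: ", overhead(long_text))   # negative: the output is smaller
```

The example strings are made up; real message traffic would need to be measured the same way.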

There are lots of options for reducing the number of bits for locations as well.
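One such option, sketched here purely as an illustration: send a position as a small offset from a reference point both nodes already share, instead of full coordinates. The scale, the 16-bit width, and the function names are all assumptions for this example and are not taken from the Meshtastic code.

```python
import struct

SCALE = 100_000  # steps of 1e-5 degree, roughly 1.1 m of latitude (assumed precision)

def pack_delta(lat, lon, ref_lat, ref_lon):
    """Encode a position as two signed 16-bit offsets from a shared reference."""
    dlat = round((lat - ref_lat) * SCALE)
    dlon = round((lon - ref_lon) * SCALE)
    # struct.pack raises struct.error if an offset does not fit in 16 bits
    # (about +/- 0.327 degrees at this scale).
    return struct.pack("<hh", dlat, dlon)

def unpack_delta(data, ref_lat, ref_lon):
    """Reverse pack_delta using the same reference point."""
    dlat, dlon = struct.unpack("<hh", data)
    return ref_lat + dlat / SCALE, ref_lon + dlon / SCALE

packed = pack_delta(37.12345, -121.98765, 37.0, -122.0)
print(len(packed), unpack_delta(packed, 37.0, -122.0))
```

This gives 4 bytes per position, compared with 8 bytes for two 32-bit values, at the cost of a limited range around the reference point.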

Spor7biker · July 26, 2020, 3:53am

Maybe we could adapt this to work on the entire message with a customized Compression Model

https://ed-von-schleck.github.io/shoco/

geeksville · July 26, 2020, 4:06am

I think also, it might be worth experimenting with something simple like huffman/deflate but with a static dictionary constructed at compile time based on a corpus of a few hundred messages. That dictionary/tree would probably give a fairly optimal encoding of the messages that we actually send. Protobufs do a good job of saving space (for small values) but I bet most of that savings is eaten up by their small amount of framing.
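A minimal sketch of the static dictionary idea, using the preset dictionary support in Python's zlib (raw deflate, no header or checksum). The corpus below is invented for illustration; a real dictionary would be built from actual messages, and the firmware would need an equivalent in C/C++.

```python
import zlib

# Hypothetical corpus; in practice this would come from real traffic.
CORPUS = [
    b"On my way back to camp",
    b"Meet at the trailhead",
    b"Battery is low",
    b"Where are you?",
]
# Both sides ship the same dictionary, so it is never transmitted.
ZDICT = b" ".join(CORPUS)

def compress(msg, zdict=None):
    """Raw deflate (wbits=-15), optionally primed with a preset dictionary."""
    if zdict is None:
        c = zlib.compressobj(9, zlib.DEFLATED, -15)
    else:
        c = zlib.compressobj(9, zlib.DEFLATED, -15, zdict=zdict)
    return c.compress(msg) + c.flush()

def decompress(data, zdict):
    """Inverse of compress() when the same dictionary was used."""
    d = zlib.decompressobj(wbits=-15, zdict=zdict)
    return d.decompress(data) + d.flush()

msg = b"Meet at the trailhead"
print(len(msg), len(compress(msg)), len(compress(msg, ZDICT)))
```

A message that resembles the corpus shrinks to a few bytes with the dictionary, while the same message without it does not shrink at all.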

Spor7biker · July 26, 2020, 4:16am

That is much more in line with what I was thinking.

Ride33Comfy · July 26, 2020, 9:33pm

HTTP/2 has support for compression of header communications. This is a well-established precedent.
https://httpwg.org/specs/rfc7540.html#HeaderBlock
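For context, the header compression used by HTTP/2 is HPACK (RFC 7541), and part of it is a static table of common headers that both ends already know. The sketch below shows only that one piece, the single-byte "indexed header field" form, with a handful of entries copied from the RFC's static table. It is an illustration of the precedent, not a full HPACK encoder.

```python
# A few entries of the HPACK static table (RFC 7541, Appendix A).
STATIC_TABLE = {
    (":method", "GET"): 2,
    (":method", "POST"): 3,
    (":path", "/"): 4,
    (":scheme", "http"): 6,
    (":scheme", "https"): 7,
}

def encode_indexed(name, value):
    """Encode a header found in the static table as one byte:
    a leading 1 bit followed by the 7-bit table index."""
    return bytes([0x80 | STATIC_TABLE[(name, value)]])

print(encode_indexed(":method", "GET").hex())  # 82
```

The whole ":method: GET" header costs one byte on the wire because the table itself is never sent, which is the same trick SMAZ relies on.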

kokroo · December 7, 2021, 10:49am

I am working on this. This is my rudimentary idea:
Fork the meshtastic app, and before any message is sent, compress it and add a special character at the beginning of the message to indicate it’s compressed. The only problem is, the meshtastic oled display will display the compressed version since I don’t know how to modify the firmware and I don’t want to mess with it. I will just decompress it on the app itself.

This is a very simple thing to implement in practice, I will report back with results once I get my hands on 2 meshtastic boards.

Spor7biker · December 7, 2021, 5:30pm

Nice.

I’m very interested in what kind of results you can achieve. Doing this in the Android app isn’t ideal but maybe it will be the initial step to move it along.

I think there is a guide for setting up a dev environment for building the device firmware. From what I’ve seen the code base is really well organized. I don’t think you’d need to concern yourself with device specifics, just dig into the message payload functions.

linagee · December 30, 2021, 3:55pm

I think Unishox shows more promise than shoco / smaz / etc. And it already runs on the ESP32!

It has these benefits:

Higher compression than shoco
“Unlike smaz and shoco, we assume no a priori knowledge about the input text. However we rely on a posteriori knowledge about the research carried out on the language and common patterns of sentence formation and come out with pre-assigned codes for each letter.”
Unicode/UTF-8 compatible. (Will users send emojis / foreign character sets with their app keyboards?)

linagee · December 30, 2021, 3:58pm

Unishox (aka Shox96) paper: https://vixra.org/pdf/1908.0403v1.pdf

Arduino implementation:

kokroo · December 30, 2021, 8:15pm

Hey, thanks for sharing this. Definitely looks interesting.

@geeksville Do the meshtastic boards have a minimum packet size? If we compress strings and only find out much later that there is a minimum payload size and padding is used, then implementing this will prove quite futile.

mc-hamster · December 30, 2021, 10:16pm

I’m not @Geeksville, but I can play him on tv.

There’s no minimum packet size and no padding. We transmit as little data as possible.

If compression is implemented, it should go lower in the stack than just messages. It can be applied to the entire payload (excluding the header) just before the payload is encrypted.

On compression, the size of the compressed result is compared against the uncompressed and then a decision is made which one to transmit.
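Roughly, that decision could look like the sketch below. The flag value and function names are hypothetical, zlib's raw deflate stands in for whatever compressor ends up being chosen, and the encryption step that would follow is left out.

```python
import zlib

FLAG_COMPRESSED = 0x01  # hypothetical flag bit; the real header layout is not specified here

def prepare_payload(payload: bytes):
    """Return (flags, body), keeping whichever form is smaller."""
    c = zlib.compressobj(9, zlib.DEFLATED, -15)  # raw deflate, no header/checksum
    packed = c.compress(payload) + c.flush()
    if len(packed) < len(payload):
        return FLAG_COMPRESSED, packed
    return 0, payload  # compression did not help, send as-is

def restore_payload(flags: int, body: bytes) -> bytes:
    """Undo prepare_payload() on the receiving side."""
    if flags & FLAG_COMPRESSED:
        return zlib.decompress(body, -15)
    return body

for sample in (b"hi", b"status ok " * 10):
    flags, body = prepare_payload(sample)
    print(len(sample), len(body), flags)
```

Because the sender falls back to the original bytes, the body never gets longer than the uncompressed payload; the only fixed cost is the one flag bit.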

If we compress the entire payload rather than just the contents of the text message, all future applications will be able to take advantage of this. Heck, even our IP Tunneling will be able to use it.

kokroo · December 30, 2021, 11:09pm

We will be able to reduce airtime as well as power consumption. If we use LongSlow, we will keep the longer range but get a somewhat shorter transmission time.

linagee · January 5, 2022, 12:44am

“If compression is implemented, it should go lower in the stack than just messages. It can be applied to the entire payload (excluding the header) just before the payload is encrypted.”

One thing to keep in mind (with Unishox anyway): it says that compressing binary data gives a worse result than no compression at all. So low level, yes, but only at a level where text strings appear. I do like the idea of an option bit specifying whether the text is compressed or legacy uncompressed. (Legacy uncompressed would work well if other file-like things are ever pushed through the mesh.)

mc-hamster · January 5, 2022, 12:49am

I think all packets should be compressed, but the compressed version should only be used if it ends up being smaller. That increases the opportunity for compression and removes any assumptions we have.

FYI - the targz library is now being used in the device code. We can use that for compression.

kokroo · January 5, 2022, 1:20am

What exactly is targz being used for?

mc-hamster · January 5, 2022, 1:34am

Added this last night:

github.com/meshtastic/Meshtastic-device pull request: Initial checkin of Online OTA SPIFFS update
meshtastic:master ← mc-hamster:SPIFFS_UPDATE, opened 04:06AM - 03 Jan 22 UTC by mc-hamster (+141 -10)

If you have a device that doesn’t have the web files or need the web files updated, click a button and it’ll be setup in about 10 seconds.

Not totally stable right now, I labeled it as experimental.
