# compressing Xilinx bitstreams

#### John Larkin

Jan 1, 1970

Forgive me if this has been asked before, but does anybody have [a way to
compress] Xilinx configuration bitstreams? I've been perusing a few of my .rbt
files, and they have long runs of 1s and 0s (interestingly, some designs
seem to have mostly 1s, others mostly 0s). I'd think that something very
simple might achieve pretty serious (maybe 2:1-ish) compression without a
lot of runtime complexity. We generally run a uP from EPROM, with the uP
code and the packed Xilinx config data in the same EPROM, and the uP
bit-bangs the Xilinx FPGA at powerup time. So a simple decompressor would
be nice.

I did google for this... haven't found much.

Thanks,

John

#### Clark Pope

Jan 1, 1970

The bit generation tool has an option to compress the .bit file. I use this
when I'm loading over JTAG to save time. I assume Xilinx has info on
in-system programming with a compressed .bit file.

However, I've observed the same phenomenon as you: when I zip a .bit file, the
result is usually less than 50% of the original size. My guess is that even a
trivial run-length encoding would help. There are plenty of resources for
Lempel-Ziv compression on the web:

see http://www.dogma.net/markn/articles/lzw/lzw.htm

If you get it working please post/send the result.

#### Austin Lesea

Jan 1, 1970

John,

I think I had heard that zipping and unzipping bit files (classic Unix or
Windows zip/unzip) gave the most compression (2:1 or better).

I think that a zip/unzip routine would be a great example of something a
uP could do without an unreasonable amount of memory (ROM+RAM) support.

Austin

#### John_H

Jan 1, 1970

First, please be aware that the ASCII .rbt file is 8x the size of the simple
.bin file. Check the bitgen options and you'll find the ability to generate the
straight binary file: 1s and 0s at the bit level, not the ASCII character
level. Compression beyond that may be what you're looking for, but please -
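
The 8x figure follows from the .rbt file spending one ASCII character on each configuration bit. A minimal PC-side sketch of the packing step is shown here. The function name `pack_rbt` is made up for illustration, and two assumptions are baked in: any line consisting only of `0` and `1` characters is treated as data (everything else is taken to be header text), and bits are packed MSB first with zero padding at the tail.

```python
def pack_rbt(text: str) -> bytes:
    """Pack the '0'/'1' characters of an .rbt-style text into bytes."""
    # Assumption: a line made up only of '0' and '1' is bitstream data;
    # anything else (header text, blank lines) is skipped.
    rows = (line.strip() for line in text.splitlines())
    bits = "".join(row for row in rows if row and set(row) <= {"0", "1"})
    # Assumption: MSB-first packing, tail padded with zeros to a whole byte.
    bits += "0" * (-len(bits) % 8)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


sample = "Xilinx ASCII Bitstream\n1111111100000001\n10000000\n"
packed = pack_rbt(sample)
print(len(packed), packed.hex())  # 24 data characters pack into 3 bytes
```

Any compression ratio quoted for a bitstream should be measured against this packed form, not against the ASCII text.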

#### Steve Casselman

Jan 1, 1970

John Larkin said:
> I'd think that something very simple might achieve pretty serious (as, maybe
> 2:1-ish) compression without a lot of runtime complexity. [...] So a simple
> decompressor would be nice.

VCC did a package called HOTMan that does compression. It takes the bit file
and turns it into a compressed file that looks like...

int testArray[2669]=\
{
0xddedda78,0xe55c8c5f,0xefe1c079.... }

We get at least 4 to 1, and small designs in a big chip can get 50 to 1. The
above format allows you to compile the design into a C/C++ program.

Steve

#### Tim Wescott

Jan 1, 1970

John said:
> I'd think that something very simple might achieve pretty serious (as, maybe
> 2:1-ish) compression without a lot of runtime complexity. [...] So a simple
> decompressor would be nice.

No links, but have you considered simple run-length encoding? I can
think of at least one scheme that would be guaranteed sub-optimal from a
compression standpoint but that wouldn't take much code: just encode
any string of 0xff or 0x00 bytes as that byte followed by a count, so
that 0x00 0x00 0x00 0x00 becomes 0x00 0x04, for instance. You have the
overhead that a lone 0x00 becomes 0x00 0x01, and you also can't encode a
run that isn't aligned to byte boundaries, but you may be happy with it
nonetheless.
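
A minimal sketch of that byte-level scheme, written for prototyping on a PC. The function names and the 255 cap on the count (so it fits in one byte) are assumptions, not something specified in the post.

```python
def rle_encode(data: bytes) -> bytes:
    """Encode runs of 0x00 or 0xFF as (byte, count); copy other bytes as-is."""
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if b in (0x00, 0xFF):
            run = 1
            # Assumption: the count is a single byte, so cap runs at 255.
            while i + run < len(data) and data[i + run] == b and run < 255:
                run += 1
            out += bytes((b, run))
            i += run
        else:
            out.append(b)
            i += 1
    return bytes(out)


def rle_decode(packed: bytes) -> bytes:
    """Inverse of rle_encode: a 0x00 or 0xFF is always followed by a count."""
    out = bytearray()
    i = 0
    while i < len(packed):
        b = packed[i]
        if b in (0x00, 0xFF):
            out += bytes([b]) * packed[i + 1]
            i += 2
        else:
            out.append(b)
            i += 1
    return bytes(out)


print(rle_encode(b"\x00\x00\x00\x00").hex())  # 0004
print(rle_encode(b"\x00").hex())              # 0001 (the overhead case)
```

The decoder is a single loop with no tables, which is the appeal for a small uP; the cost is that isolated 0x00 and 0xFF bytes double in size.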

#### John Larkin

Jan 1, 1970

John_H said:
> First, please be aware that the ASCII .rbt file is 8x the simple .bin file
> size. Check the bitgen options and you'll find the ability to generate the
> straight binary file - 1s and 0s at the bit level, not the ASCII character
> level. Compression beyond that may be what you're looking for, but please -

Of course. We have a little utility, vaguely like a linker, that
gobbles up Motorola .s28 files and Xilinx .rbt files and builds a rom
image, all properly squashed into bits. It's cute... it even saves the
beginning of the rbt ASCII header in the rom image for FPGA version
verification. My observation was that the bits themselves include long
runs of 1s or 0s.

I'd like to design a board using a 28-pin eprom (space is at a premium
here) but plan hooks for using a bigger Xilinx chip some day, and then
I'd run out of rom space to store the config bits. So having a
compression scheme would give us the margin to use the small eprom.

Suppose the compressed data were an array of bytes. If the MS bit of a
byte is 0, the remaining 7 bits are loaded verbatim; if the MS
bit is 1, the other 7 bits specify a run of up to 63 1's or 0's (one bit
for the value, six bits for the length).

Something like that; the exact numbers may need tuning. Very easy to
unpack, not hard to encode. I'd have to test some actual config files
to see how good something like this could compress.
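
A rough PC-side prototype of this token format, for experimenting with real files. Several details are assumptions rather than part of the proposal: bit 6 of a run token holds the run's value and bits 5..0 its length, runs shorter than `MIN_RUN` are sent as literals, and the decoder is told the total bit count so that padding in the final literal can be dropped.

```python
MIN_RUN = 8  # assumption: shorter runs are cheaper as 7-bit literals


def encode(bits: str) -> bytes:
    """bits is a string of '0'/'1'; returns one token per output byte."""
    out = bytearray()
    i = 0
    while i < len(bits):
        run = 1
        while i + run < len(bits) and bits[i + run] == bits[i] and run < 63:
            run += 1
        if run >= MIN_RUN:
            # Run token: 1 v llllll  (v = bit value, l = length 1..63)
            out.append(0x80 | (int(bits[i]) << 6) | run)
            i += run
        else:
            # Literal token: 0 bbbbbbb  (7 bits verbatim, zero padded at the end)
            out.append(int(bits[i:i + 7].ljust(7, "0"), 2))
            i += 7
    return bytes(out)


def decode(packed: bytes, nbits: int) -> str:
    parts = []
    for byte in packed:
        if byte & 0x80:
            parts.append(("1" if byte & 0x40 else "0") * (byte & 0x3F))
        else:
            parts.append(format(byte, "07b"))
    return "".join(parts)[:nbits]  # drop padding from the last literal


row = "0" + "1" * 58 + "0" + "1" * 26 + "0"   # made-up, run-heavy sample
print(len(row) / 8, "bytes packed ->", len(encode(row)), "bytes encoded")
```

With this layout, data with no long runs grows by a factor of 8/7, because every output byte carries only seven payload bits.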

John

#### Greg Neff

Jan 1, 1970

John Larkin said:
> I did google for this... haven't found much.

See:

www.ee.washington.edu/people/faculty/hauck/publications/runlength.PDF
www.ee.washington.edu/people/faculty/hauck/publications/runlengthTR.PDF
www.ee.washington.edu/people/faculty/hauck/publications/runlengthJ.pdf

It should be straightforward to write some run-length (RLE) compression and
decompression code. You might want to test the algorithms on a PC to
make sure that the decompressed output is identical to the
uncompressed input. A garbled bitstream can have the same effect as
the MC6800 HCF opcode...
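
One way to run that check on a PC is a small round-trip harness fed with random, run-heavy data. This is only a sketch: `check_roundtrip` is a made-up helper, and `zlib` stands in for whatever compressor and decompressor pair is actually under test.

```python
import random
import zlib


def check_roundtrip(compress, decompress, trials=200, seed=1):
    """Return True if decompress(compress(x)) == x for every generated input."""
    rng = random.Random(seed)
    for _ in range(trials):
        data = bytearray()
        # Build run-heavy input: repeated 0x00, 0xFF, or a random byte.
        for _ in range(rng.randint(0, 20)):
            value = rng.choice((0x00, 0xFF, rng.getrandbits(8)))
            data += bytes([value]) * rng.randint(1, 40)
        data = bytes(data)
        if decompress(compress(data)) != data:
            return False
    return True


print(check_roundtrip(zlib.compress, zlib.decompress))
```

Running the same harness against the real configuration files is still worthwhile, since random data will not reproduce every pattern a bitstream contains.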

================================

Greg Neff
VP Engineering
*Microsym* Computers Inc.

#### Paul Leventis (at home)

Jan 1, 1970

Hi John,

> Forgive me if this has been asked before, but does anybody have
> Xilinx configuration bitstreams?

[Some Altera device families] (Cyclone, Stratix II) support on-the-fly decompression of the bitstream.
The Quartus software compresses the bitstream which is then programmed into
the device using pretty much any of the many methods of programming
available, and the chip's configuration controller will decompress the
bitstream that it sees. This typically achieves a 1.9-2.3:1 compression
ratio, depending on the device utilization, RAM contents and such.

Some of our programming devices also can decompress bitstreams on-the-fly,
allowing bitstream compression for other chip families that do not support
decompression internally.

See the Configuration Handbook Volume 2
(http://www.altera.com/literature/hb/cfg/cfg_volume2.pdf) for a detailed
description of device programming and compression options.

Regards,

Paul Leventis
Altera Corp.

#### Allan Herriman

Jan 1, 1970

Clark Pope said:
> The bit generation tool has an option to compress the .bit file. I use this
> when I'm loading over JTAG to save time. I assume Xilinx has info on in
> system programming with a compressed .bit file.

This 'compression' merely merges identical frames. The probability of
getting identical frames in a well utilised FPGA isn't very high, so
this doesn't result in much reduction in file size.
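
To get a feel for how much identical-frame merging could save on a given file, one can count duplicate frames. The sketch below is not the bitgen algorithm; it assumes the configuration data has already been stripped to a flat run of equal-length frames, and `frame_len` is a hypothetical parameter that depends on the device.

```python
def duplicate_frame_fraction(data: bytes, frame_len: int) -> float:
    """Fraction of whole frames that repeat an earlier frame exactly."""
    frames = [data[i:i + frame_len]
              for i in range(0, len(data) - frame_len + 1, frame_len)]
    if not frames:
        return 0.0
    return 1.0 - len(set(frames)) / len(frames)


# Made-up data: four 4-byte frames, three of them identical.
demo = b"AAAA" + b"BBBB" + b"AAAA" + b"AAAA"
print(duplicate_frame_fraction(demo, 4))  # 0.5
```

A low fraction on a well-utilised design would be consistent with the small reduction in file size described above.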

Some experiments I did a few years ago (on Virtex-E and Virtex-2
files) indicated that this compression made subsequent compression
by tools such as gzip *worse*.

Regards,
Allan.

#### Neil Glenn Jacobson

Jan 1, 1970

serial PROMs support storage of compressed bitstream data. The data is
compressed when you translate to the PROM format and the PROM does the
decompression before delivery to the FPGA.

http://www.xilinx.com/bvdocs/publications/ds123.pdf

#### roller

Jan 1, 1970

John Larkin said:
> I did google for this... haven't found much.

Try searching for RLE (run-length encoding); that's the encoding used for
.PCX graphic files.

#### Nico Coesel

Jan 1, 1970

John Larkin said:
> I'd think that something very simple might achieve pretty serious (as, maybe
> 2:1-ish) compression without a lot of runtime complexity.

Tried it, but found that the files aren't reduced in size much and, more
importantly, the software required to decompress the file eats away all
the savings for a 400k-gate device. In other words: unless you have more
than around half a million gates of configuration data, it's not worth
it.

#### John Larkin

Jan 1, 1970

Nico Coesel said:
> Tried it but found the files aren't reduced in size much and more
> important, the software required to decompress the file eats away all
> the savings for a 400k device. In other words: Unless you have more
> than around half a million gates of configuration data, it's not worth
> it.

OK, bear with me on this. Here's a piece of a .rbt for a Spartan XL...

01111111111111111111111111111111111111111111111111111111111011111111111111111111111111110111111110111111011111111110111111110101011101111110111111011111111111111111110011111111111111111111111111111111111111111111111111111110101
01111111111111111111111111111111111111111111111111111111111111111111111111111101111111111111111111111111110111111101111111111111111110111111111111110111111111111011111101111111111111111111111111111111111111111111111111111110011
01111111111111111111111111111111111111111111111111111100011111111111111111101111111100111111110011111111111111011101111111111100111011110011111011111111111111111111111110110111001111111111111111111111110111111011111111111111011
01111111111111111111111111111111111111111111111111111111011111111111111111101111111101011111111110011111111111111100111111111111011111111101111111111111111111110111101111111111110111111111111111111111111111111111111111111111110
01111111111111111111111111111111111111111111110111111111111111111111111111111111111111111111111011111111111111111011010111111110011111111011111111111011111011111011110101111111000111111111011111111111101111111111111110101101111
00111111111111111111111111111111111111111111000111111111111111111111111111111111111111111111111111111111110111111111110110111111011111111111111111111101111111111111111101111111110111111100011111111111111111111111111101101100000
01111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111110111111111101110111101011111111111111111111111111111111111111101111111101111111111111111111111111111111111111110111111100
00011111111111111111111111111111111111111111111111111111111100111111011111111111001111110110101111001111111101111111111111001111111100111111111001111101101011110110011111101010111101111111111111111111111010111100111111111111000
01101011111111111111111111111111111111111111111111111111111110101111111111111111101011111110011110111111110101001110111111101011011100111111111010010111001111110110101101111111111111111111111111111111110011111101111111010100111
01111011111111111111111111111111111111111111111111111111111111111111011111111111111111110111111111111111110110111111111111101011011111111111111111111101111111111111111101111111111101111111111111111111111111111111111111011111010
01101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111110011111111111111111111111111111111111111111001111111111111111110111111111111111111111111111111111111111111111111100111110011000
01111111111111111111111111111111111111111111111111111111111011111111101111111110111111111011111111101111011110111111111110111101111001101111101111111110101011111011010111101111111110111111101111111111111111111110111111110100011
01101111111111111111111111111111111111111111111011111111111111111111111111111111111111111111111111111111111101011010111111111111110111101111111111111101011011111111111111011110111111111111111111111111111111111111111111111110101
00111100111111111111111111111111111111111111111011111111110100111111100011111101001111111000111111111111111110101011111101101011110010011111011011111111101011110110101111010001111110111111111111111111111111111101111100111110111

There are lots of 1's here, and other hunks of this file are almost all
1's. So what we need is a not-very-general compression scheme, with
the only "dictionary" entry being "the following is a hunk of 1's". Then
the decompressor could be very simple.

Interestingly, this is for a Spartan 2:

00000000000001001000000000000000
00000000000000000000000000000000
00000000000100100000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000001001000000000000000
00000000000000000000000000000000
00000000000100100100000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000001001100000000000000
00000000000000000000000000000000
11111111000100110000000100000100
00000000010001000000000000010000
00000000000001110100100000000000
11010100000000000011010000000000
00000001000000000000000000001000
00111111110001000000000000000000

Which has long runs of zeroes!

Just eyeballing these files, it looks like something very simple could
get at least a 2:1 squash factor.

John

#### Nico Coesel

Jan 1, 1970

John Larkin said:
> OK, bear with me on this. Here's a piece of a .rbt for a Spartan XL...
>
> [...]
>
> Which has long runs of zeroes!
>
> Just eyeballing these files, it looks like something very simple could
> get at least a 2:1 squash factor.

Did you ever try to compress these files? I totally agree with you
that these files _look_ easy to compress, but they aren't. I tried
RLE, but that will only save 5% to 10%. ZIP does a little better. I
just tried to compress a .bit file for a 400k gate Xilinx device and
it reduces the size by 26% but you'll need to have room for the ZIP
decompression code...
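
This kind of measurement is easy to repeat on a PC before committing to a scheme. The sketch below uses Python's `zlib` module, which implements DEFLATE, the same method ZIP normally uses; the `saving` helper and the two synthetic inputs are illustrative only and are not real bitstreams.

```python
import random
import zlib


def saving(data: bytes, level: int = 9) -> float:
    """Fraction of the original size removed by zlib (negative if it grew)."""
    return 1.0 - len(zlib.compress(data, level)) / len(data)


rng = random.Random(0)
sparse = bytes(4096)                                    # all zeros: long runs
dense = bytes(rng.getrandbits(8) for _ in range(4096))  # no visible structure
print(f"sparse: {saving(sparse):.1%}  dense: {saving(dense):.1%}")
```

To reproduce a figure like the 26% above, replace the synthetic data with the contents of an actual binary configuration file.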

#### Tim

Jan 1, 1970

Nico said:
> Did you ever try to compress these files? I totally agree with you
> that these files _look_ easy to compress, but they aren't.

But with a little knowledge of the structure maybe we can do
better than blind RLE or whatever. Surely any structure
which the eye can see can be efficiently encoded?

e.g. "There will be lots of repeats for unused LUTs.
These are coded as abc and should be decoded as xyz"

#### Nico Coesel

Jan 1, 1970

Tim said:
> But with a little knowledge of the structure maybe we can do
> better than blind RLE or whatever. Surely any structure
> which the eye can see can be efficiently encoded?

Another poster claims huge space savings by using a special tool. I
haven't looked into it.

> e.g. "There will be lots of repeats for unused LUTs.
> These are coded as abc and should be decoded as xyz"

That's the problem: the routing software smears the design over
the entire FPGA if it can. You can specify that unused space be left out of
the bit file, but you'll see the length vary with every routing run.
Perhaps the best space saver is to constrain the router to use only the
part of the FPGA that is just big enough to contain your design, and then
specify that the unused part be left out.

#### Zak

Jan 1, 1970

Nico said:
> Did you ever try to compress these files? I totally agree with you
> that these files _look_ easy to compress, but they aren't. I tried
> RLE, but that will only save 5% to 10%.

Probably because RLE looks for repeating bytes, while here we have only
repeating stretches of 0's. What might work is to re-code the file into
numbers giving the number of 0 bits between 1's as a first step:

00100000101000000000010000011000000000001 would turn into
2 - 5 - 1 - 10 - 5 - 0 - 11.

Stretches of 0 more than 254 long could be encoded as 255, meaning 255
zeroes and no 1, with the next number giving more 0's. 1-[255 0s]-1
would code to 255 0 in that case.

The resulting bytes are probably easier to Huffman compress. Or it may
pay to do this for 0 runs up to 16 long, coding these as bytes with
values 0-15 (not as nibble pairs; subsequent nibbles probably do not
have any relationship).
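
A small sketch of the first step, the zero-gap recoding with 255 as the "255 zeroes and no 1" escape. The function names are made up, and one detail is an assumption: the decoder is given the total bit count so that a trailing stretch of 0's with no closing 1 can be represented.

```python
def gaps_encode(bits: str) -> bytes:
    """Emit, for each 1 bit, the number of 0 bits that precede it."""
    out = bytearray()
    zeros = 0
    for bit in bits:
        if bit == "0":
            zeros += 1
            if zeros == 255:
                out.append(255)  # escape: 255 zeroes and no 1
                zeros = 0
        else:
            out.append(zeros)
            zeros = 0
    if zeros:
        # Assumption: trailing zeroes are written as a normal gap; the
        # implied 1 is trimmed because the decoder knows the bit count.
        out.append(zeros)
    return bytes(out)


def gaps_decode(packed: bytes, nbits: int) -> str:
    parts = ["0" * 255 if n == 255 else "0" * n + "1" for n in packed]
    return "".join(parts)[:nbits]


gaps = [2, 5, 1, 10, 5, 0, 11]
bits = "".join("0" * g + "1" for g in gaps)   # rebuilds the example's pattern
print(list(gaps_encode(bits)))                # [2, 5, 1, 10, 5, 0, 11]
```

The output bytes could then be fed to a Huffman coder, as suggested; for the mostly-1's files the same idea applies with the roles of 0 and 1 swapped.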

Thomas

#### John Larkin

Jan 1, 1970

Nico Coesel said:
> Did you ever try to compress these files? I totally agree with you
> that these files _look_ easy to compress, but they aren't. I tried
> RLE, but that will only save 5% to 10%. ZIP does a little better. I
> just tried to compress a .bit file for a 400k gate Xilinx device and
> it reduces the size by 26% but you'll need to have room for the ZIP
> decompression code...

I tried my simple run-encoder. On various designs I have around, it
achieved compression ratios (compressed size / original size) of 0.56 at
best and 1.04 at worst (i.e., compressed was bigger than uncompressed!).
The worst was on a fairly dense XC2S400 BGA part, whose rbt file had hardly
any long runs of anything. Even pkzip only managed to crunch the binary
config image to 0.74 on this one. It looks to me that the newer Xilinx chip
files tend to be less compressible... they seem to have fewer runs. So maybe
there's no very-simple-to-unpack thing that's generally useful.

Needs more thought someday, I guess.

John

#### Nico Coesel

Jan 1, 1970

Zak said:
> What might work is to re-code the file into
> numbers giving the number of 0 bits between 1's as a first step:
>
> 00100000101000000000010000011000000000001 would turn into
> 2 - 5 - 1 - 10 - 5 - 0 - 11.
>
> [...]
>
> The resulting bytes are probably easier to huffman compress.

This makes sense. I haven't tried it though. I presume(d) ZIP looks at
the bits instead of the bytes. Still, don't feel lucky just because you
see a lot of contiguous '1's and '0's.

Here is a wild idea:
Another way of compressing the file may be to strip the frame
headers (which are repeated at the start of each frame, so they can
easily be added back during decompression) and sort the resulting data.
The next step is compressing it, not going from left to right but
from top to bottom, compressing column after column. Because of
the sorting, the fewest changes from 0 to 1 are to be expected within a
column. Decompressing, however, would require a fair amount of memory, so
the data also has to be divided into blocks so that only one block at a time
needs to be decompressed. IIRC it doesn't matter in which order the data
frames are loaded as long as the command frames are in the right
place.
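
A toy illustration of why sorting and then scanning column by column might help, counting 0/1 transitions as a crude proxy for run length. Everything here is assumed for the sake of the example: the frames are made up, all the same length, and already stripped of headers. A real decoder would also need enough information to put each frame back where it belongs.

```python
def transitions(bits: str) -> int:
    """Number of places where adjacent bits differ."""
    return sum(a != b for a, b in zip(bits, bits[1:]))


def column_major(rows) -> str:
    """Read equal-length rows top to bottom, one column after another."""
    return "".join("".join(col) for col in zip(*rows))


frames = ["0101", "0101", "1010", "0101"]   # made-up frames
row_scan = "".join(frames)                  # left to right, frame by frame
col_scan = column_major(sorted(frames))     # sorted, then column by column
print(transitions(row_scan), transitions(col_scan))  # 13 4
```

Fewer transitions mean longer runs for a run-length coder, but whether real frames behave like this toy data would have to be checked on actual files.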

Xilinx has some thorough information on their programming datastream
on their website.
