1
2
3 __version__ = '1.3'
4 __author__ = "Avinash Kak (kak@purdue.edu)"
5 __date__ = '2006-Dec-26'
6 __url__ = 'http://RVL4.ecn.purdue.edu/~kak/dist/BitVector-1.3.html'
7 __copyright__ = "(C) 2006 Avinash Kak. GNU GPL 2."
8
9 __doc__ = '''
10
11 BitVector.py
12
13 Version: ''' + __version__ + '''
14
15 Author: Avinash Kak (kak@purdue.edu)
16
17 Date: ''' + __date__ + '''
18
19
20 CHANGE LOG:
21
22 Version 1.3:
23
24 (a) One more constructor mode included: When initializing a
25 new bit vector with an integer value, you can now also
26 specify a size for the bit vector. The constructor zero-pads
27 the bit vector from the left with zeros. (b) The BitVector
28 class now supports 'if x in y' syntax to test if the bit
29 pattern 'x' is contained in the bit pattern 'y'. (c) Improved
30 syntax to conform to well-established Python idioms. (d) What
31 used to be a comment before the beginning of each method
32 definition is now a docstring.
33
34 Version 1.2:
35
36 (a) One more constructor mode included: You can now construct
37 a bit vector directly from a string of 1's and 0's. (b) The
38 class now constructs a shortest possible bit vector from an
39 integer value. So the bit vector for the integer value 0 is
40 just one bit of value 0, and so on. (c) All the rich
41 comparison operators are now overloaded. (d) The class now
42 includes a new method 'intValue()' that returns the unsigned
43 integer value of a bit vector. This can also be done through
44 '__int__'. (e) The package now includes a unittest based
45 framework for testing out an installation. This is in a
46 separate directory called "TestBitVector".
47
48 Version 1.1.1:
49
50 The function that does block reads from a disk file now peeks
51 ahead at the end of each block to see if there is anything
52 remaining to be read in the file. If nothing remains, the
53 more_to_read attribute of the BitVector object is set to
54 False. This simplifies reading loops. This version also
55 allows BitVectors of size 0 to be constructed
56
57
58 Version 1.1:
59
60 I have changed the API significantly to provide more ways for
61 constructing a bit vector. As a result, it is now necessary
62 to supply a keyword argument to the constructor.
63
64
65
66 INTRODUCTION:
67
68 The BitVector class for a memory-efficient packed representation
69 of bit arrays and for logical operations on such arrays. The
70 core idea used in this Python script for bin packing is based on
71 an internet posting by Josiah Carlson to the Pyrex mailing list.
72
73 Operations supported on bit vectors:
74
75 __getitem__
76 __setitem__
77 __len__
78 __iter__
79 __contains__
80 __getslice__
81 __str__
82 __int__
83 __add__
84 __eq__, __ne__, __lt__, __le__, __gt__, __ge__
85 | for bitwise or
86 & for bitwise and
87 ^ for bitwise xor
88 ~ for bitwise inversion
89 << for circular rotation to the left
90 >> for circular rotation to the right
91 + for concatenation
92 intValue() for returning the integer value
93 divide_into_two
94 permute
95 unpermute
96 pad_from_left
97 pad_from_right
98 read_bits_from_file
99 write_to_file
100 read_bits_from_fileobject
101 write_bits_to_fileobject
102
103
104
105
106 CONSTRUCTING BIT VECTORS:
107
108
109 You can construct a bit vector in six different ways.
110
111 (1) You can construct a bit vector directly from either a tuple
112 or a list of bits, as in
113
114 bv = BitVector( bitlist = [1,0,1,0,0,1,0,1,0,0,1,0,1,0,0,1] )
115
116 (2) You can construct a bit vector from an integer by
117
118 bv = BitVector( intVal = 56789 )
119
120 The bits stored now will correspond to the binary
121 representation of the integer. The resulting bit vector is
122 the shortest possible bit vector for the integer value
123 supplied. For example, when intVal is 0, the bit vector
124 constructed will consist of just the bit 0.
125
126
127 (3) When initializing a bit vector with an intVal as shown
128 above, you can also specify a size for the bit vector:
129
130 bv = BitVector( intVal = 0, size = 8 )
131
132 will return the bit vector consisting of the bit pattern
133 00000000. The zero padding needed for meeting the size
134 requirement is always on the left. If the size supplied is
135 smaller than what it takes to create the shortest possible
136 bit vector for intVal, an exception is thrown.
137
138
139 (4) You can create a zero-initialized bit vector of a given size
140 by
141
142 bv = BitVector( size = 62 )
143
144 This bit vector will hold exactly 62 bits, all initialized to
145 the 0 bit value.
146
147 (5) You can construct a bit vector from a disk file by a two-step
148 procedure. First you construct an instance of bit vector by
149
150 bv = BitVector( filename = 'somefile' )
151
152 This bit vector itself is incapable of holding the bits. To
153 now create bit vectors that actually hold the bits, you need
154 to make the following sort of a call on the above variable
155 bv:
156
157 bv1 = bv.read_bits_from_file( 64 )
158
159 bv1 will be a regular bit vector containing 64 bits from the
160 disk file. If you want to re-read a file from the beginning
161 for some reason, you must obviously first close the file
162 object that was acquired with a call to the BitVector
163 constructor with a filename argument. This can be
164 accomplished by
165
166 bv.close_file_object()
167
168 (6) You can construct a bit vector from a string of 1's and 0's
169 by
170
171 bv = BitVector( bitstring = '110011110000' )
172
173 (7) Yet another way to construct a bit vector is to read the bits
174 directly from a file-like object, as in
175
176 x = "111100001111"
177 fileobj = StringIO.StringIO( x )
178 bv = BitVector( fp = fileobj )
179
180
181
182 OPERATIONS SUPPORTED BY THE BITVECTOR CLASS:
183
184 DISPLAYING BIT VECTORS:
185
186
187 1) Since the BitVector class implements the __str__ method, a
188 bit vector can be displayed on a terminal by
189
190 print bitvec
191
192 Basically, you can always obtain the string representation
193 of a bit vector by
194
195 str( bitvec )
196
197 and integer value by
198
199 int( bitvec )
200
201
202
203 ACCESSING AND SETTING INDIVIDUAL BITS AND SLICES:
204
205
206 2) Any single bit of a bit vector bv can be set to 1 or 0 by
207
208 bv[M] = 1_or_0
209 print bv[M]
210
211 for accessing (and setting) the bit at the position that is
212 indexed M. You can retrieve the bit at position M by bv[M].
213
214 3) A slice of a bit vector obtained by
215
216 bv[i:j]
217
218 is a bit vector constructed from the bits at index positions
219 from i through j-1.
220
221 4) You can iterate over a bit vector, as illustrated by
222
223 for bit in bitvec:
224 print bit,
225
226 This is made possible by the override definition for the
227 special __iter__() method.
228
229 5) Negative subscripts for array-like indexing are supported.
230 Therefore,
231
232 bitvec[ -i ]
233
234 is legal assuming that the index range is not violated.
235
236
237
238 LOGICAL OPERATIONS ON BIT VECTORS:
239
240
241 6) Given two bit vectors bv1 and bv2, you can perform bitwise
242 logical operations on them by
243
244 result_bv = bv1 ^ bv2
245 result_bv = bv1 & bv2
246 result_bv = bv1 | bv2
247 result_bv = ~bv1
248
249
250
251 COMPARING BIT VECTORS:
252
253 7) Given two bit vectors bv1 and bv2, you can carry out the
254 following comparisons that return Boolean values:
255
256 bv1 == bv2
257 bv1 != bv2
258 bv1 < bv2
259 bv1 <= bv2
260 bv1 > bv2
261 bv1 >= bv2
262
263 The equalities and inequalities are determined by the integer
264 values associated with the bit vectors.
265
266
267
268
269 OTHER SUPPORTED OPERATIONS:
270
271
272 8) You can permute and un-permute bit vectors:
273
274 bv_permuted = bv.permute( permutation_list )
275
276 bv_unpermuted = bv.unpermute( permutation_list )
277
278
279 9) Left and right circular rotations can be carried out by
280
281 bitvec << N
282
283 bitvec >> N
284
285 for circular rotations to the left and right by N bit
286 positions.
287
288
289 10) A bit vector containing an even number of bits can be
290 divided into two equal parts by
291
292 [left_half, right_half] = bitvec.divide_into_two()
293
294 where left_half and right_half hold references to the two
295 returned bit vectors.
296
297
298 11) You can find the integer value of a bit array by
299
300 bitvec.invValue()
301
302 or by
303
304 int( bitvec )
305
306
307 12) You can convert a bit vector into its string representation
308 by
309
310 str( bitvec )
311
312
313 13) Because __add__ is supplied, you can always join two
314 bit vectors by
315
316 bitvec3 = bitvec1 + bitvec2
317
318 bitvec3 is a new bit vector that contains all the
319 bits of bitvec1 followed by all the bits of bitvec2.
320
321
322 14) You can write a bit vector directly to a file, as
323 illustrated by the following example that reads one bit
324 vector from a file and then writes it to another
325 file
326
327 bv = BitVector( filename = 'input.txt' )
328 bv1 = bv.read_bits_from_file(64)
329 print bv1
330 FILEOUT = open( 'output.txt', 'w' )
331 bv1.write_to_file( FILEOUT )
332 FILEOUT.close()
333
334 IMPORTANT: The size of bit vector must be a multiple of
335 of 8 for this write function to work. If this
336 condition is not met, the function throws an
337 exception.
338
339 15) You can also write a bit vector directly to a stream
340 object, as illustrated by
341
342 fp_write = StringIO.StringIO()
343 bitvec.write_bits_to_fileobject( fp_write )
344 print fp_write.getvalue() # 111100001111
345
346
347 16) You can pad a bit vector from the left or from the
348 right with a designated number of zeros
349
350 bitvec.pad_from_left( n )
351
352 bitvec.pad_from_right( n )
353
354 In the first case, the new bit vector will be the same
355 as the old bit vector except for the additional n zeros
356 on the left. The same thing happens in the second
357 case except that now the additional n zeros will be on
358 the right.
359
360 17) You can test if a bit vector x is contained in another bit
361 vector y by using the syntax 'if x in y'. This is made
362 possible by the override definition for the special
363 __contains__() method.
364
365
366
367 HOW THE BIT VECTORS ARE STORED:
368
369
370 The bits of a bit array are stored in 16-bit unsigned ints.
371 After resolving the argument with which the constructor is
372 called (which happens in lines (A2) through (A68) of the file
373 BitVector.py), the very first thing that the constructor does is
374 to figure out in line (A69) as to how many of those 2-byte ints
375 it needs for the bits. For example, if you wanted to store a
376 64-bit array, the variable 'two_byte_ints_needed' in line (A69)
377 would be set to 4. (This does not mean that the size of a bit
378 vector must be a multiple of 16. Any sized bit vectors can
379 constructed using the required number of two-byte ints.) Line
380 (A70) then creates an array of 2-byte ints and initializes it
381 with the required number of zeros. Lines (A71) then shifts the
382 bits into the array of two-byte ints.
383
384 As mentioned above, note that it is not necessary for the size
385 of the vector to be a multiple of 16 even though we are using
386 C's unsigned short as as a basic unit for storing the bit
387 arrays. The class BitVector keeps track of the actual number of
388 bits in the bit vector through the "size" instance attribute.
389
390 With regard to the code in lines (A2) through (A68) of the file
391 BitVector.py, note that, except for one case, the constructor
392 must be called with a single keyword argument, which determines
393 how the bit vector will be constructed. The single exception to
394 this rule is for the keyword argument 'intVal' which can be used
395 along with the 'size' keyword argument. When 'intVal' is used
396 with the 'size' option, the bit vector constructed for the
397 integer is the shortest possible bit vector. On the other hand,
398 when 'size' is also specified, the bit vector is padded with
399 zeroes from the left so that it has the specified size.
400
401 Lines (A14) through (A20) are for the following sort of a call
402
403 bv = BitVector( filename = 'myfilename' )
404
405 This call returns a bit vector on which you must subsequently
406 invoke the 'read_bits_from_file()' method to actually obtain a
407 bit vector consisting of the bits that constitute the
408 information stored in the file.
409
410 Lines (A21) through (A26) are for the case when you want to
411 construct a bit vector by reading the bits off a file-like
412 object, as in
413
414 x = "111100001111"
415 fileobj = StringIO.StringIO( x )
416 bv = BitVector( fp = fileobj )
417
418 Lines (A27) through (A52) are for the case when you want to
419 construct a bit vector from an integer, as in
420
421 bv = BitVector( intVal = 123456 )
422
423 The bits stored in the bit vector will correspond to the binary
424 representation of the integer argument provided. The bit vector
425 constructed with the above call will be the shortest possible
426 bit vector for the integer supplied. As a case in point, when
427 the intVal is 0, the bit vector will consist of a single bit
428 which will be 0 also. The code in lines (A27) through (A52) can
429 also handle the following sort of a call
430
431 bv = BitVector( intVal = 46, size = 16 )
432
433 which returns a bit vector of a specfic size by padding the
434 shortest possible bit vector the the intVal with zeros from the
435 left.
436
437 Lines (A53) through (A57) are for constructing a bit vector with
438 just the size information, as in
439
440 bv = BitVector( size = 61 )
441
442 This returns a bit vector that will hold exactly 61 bits, all
443 initialized to the zero value.
444
445 Lines (A58) through (A62) are for constructing a bit vector from
446 a bitstring, as in
447
448 bv = BitVector( bitstring = '00110011111' )
449
450 Finally, lines (A63) through (A66) are for constructing a bit
451 vector from a list or a tuple of the individual bits:
452
453 bv = BitVector( bitlist = (1, 0, 1, 1, 0, 0, 1) )
454
455 The bit vector constructed is initialized with the supplied
456 bits.
457
458
459
460 ACKNOWLEDGEMENTS:
461
462 The author is grateful to Oleg Broytmann for suggesting many
463 improvements that were incorporated in Version 1.1 of this
464 package. The author would like to thank Kurt Schwehr whose
465 email resulted in the creation of Version 1.2. Kurt also caught
466 an error in my earlier version of 'setup.py' and suggested a
467 unittest based approach to the testing of the package. Kurt
468 also supplied the Makefile that is included in this
469 distribution. The author would also like to thank all (Scott
470 Daniels, Blair Houghton, and Steven D'Aprano) for their
471 responses to my comp.lang.python query concerning how to make a
472 Python input stream peekable. This feature was included in
473 Version 1.1.1.
474
475 With regard to the changes incorporated in Version 1.3, thanks
476 are owed to Kurt Schwehr and Gabriel Ricardo for bringing to my
477 attention the bug related to the intVal method of initializing a
478 bit vector when the value of intVal exceeded sys.maxint. This
479 problem is fixed in Version 1.3. Version 1.3 also includes many
480 other improvements that make the syntax better conform to the
481 standard idioms of Python. These changes and the addition of
482 the new constructor mode (that allows a bit vector of a given
483 size to be constructed from an integer value) are also owing to
484 Kurt's suggestions.
485
486
487
488 ABOUT THE AUTHOR:
489
490 Avi Kak is the author of "Programming with Objects: A
491 Comparative Presentation of Object-Oriented Programming
492 with C++ and Java", published by John-Wiley in 2003. This
493 book presents a new approach to the combined learning of
494 two large object-oriented languages, C++ and Java. It is
495 being used as a text in a number of educational programs
496 around the world. This book has also been translated into
497 Chinese. For further information, please visit
498 www.programming-with-objects.com
499
500
501
502 SOME EXAMPLE CODE:
503
504 #!/usr/bin/env python
505 import BitVector
506
507 # Construct a bit vector from a list or tuple of bits:
508 bv = BitVector.BitVector( bitlist = (1, 0, 0, 1) )
509 print bv # 1001
510
511 # Construct a bit vector from an integer:
512 bv = BitVector.BitVector( intVal = 5678 )
513 print bv # 0001011000101110
514
515 # Construct a bit vector of a given size from a given
516 # integer:
517 bv = BitVector( intVal = 45, size = 16 )
518 print bv # 0000000000101101
519
520 # Construct a zero-initialized bit vector of a given size:
521 bv = BitVector.BitVector( size = 5 )
522 print bv # 00000
523
524 # Construct a bit vector from a bit string:
525 bv = BitVector.BitVector( bitstring = '110001' )
526 print bv[0], bv[1], bv[2], bv[3], bv[4], bv[5] # 1 1 0 0 0 1
527 print bv[-1], bv[-2], bv[-3], bv[-4], bv[-5], bv[-6] # 1 0 0 0 1 1
528
529 # Construct a bit vector from a file like object:
530 import StringIO
531 x = "111100001111"
532 fp_read = StringIO.StringIO( x )
533 bv = BitVector.BitVector( fp = fp_read )
534 print bv # 111100001111
535
536 # Experiments with bit-wise logical operations:
537 bv3 = bv1 | bv2
538 bv3 = bv1 & bv2
539 bv3 = bv1 ^ bv2
540 bv6 = ~bv5
541
542 # Find the length of a bit vector
543 print len( bitvec )
544
545 # Find the integer value of a bit vector
546 print int( bitvec )
547
548 # Open a file for reading bit vectors from
549 bv = BitVector.BitVector( filename = 'TestBitVector/testinput1.txt' )
550 print bv # nothing yet
551 bv1 = bv.read_bits_from_file(64)
552 print bv1 # first 64 bits from the file
553
554 # Divide a bit vector into two equal sub-vectors:
555 [bv1, bv2] = bitvec.divide_into_two()
556
557 # Permute and Un-Permute a bit vector:
558 bv2 = bitvec.permute( permutation_list )
559 bv2 = bitvec.unpermute( permutation_list )
560
561 # Try circular shifts to the left and to the right
562 bitvec << 7
563 bitvec >> 7
564
565 # Try 'if x in y' syntax for bit vectors:
566 bv1 = BitVector( bitstring = '0011001100' )
567 bv2 = BitVector( bitstring = '110011' )
568 if bv2 in bv1:
569 print "%s is in %s" % (bv2, bv1)
570 else:
571 print "%s is not in %s" % (bv2, bv1)
572
573 .....
574 .....
575
576 (For a more complete working example, see the example code in
577 the BitVectorDemo.py file in the Examples sub-directory.)
578
579 '''
580
581
582 import sys
583 import array
584 import exceptions
585 import operator
586
587 _hexdict = { '0' : '0000', '1' : '0001', '2' : '0010', '3' : '0011',
588 '4' : '0100', '5' : '0101', '6' : '0110', '7' : '0111',
589 '8' : '1000', '9' : '1001', 'a' : '1010', 'b' : '1011',
590 'c' : '1100', 'd' : '1101', 'e' : '1110', 'f' : '1111' }
591
593 '''If this function can read all blocksize bits, it peeks ahead to
594 see if there is anything more to be read in the file. It uses
595 tell-read-seek mechanism for this in lines (R18) through (R21). If
596 there is nothing further to be read, it sets the more_to_read
597 attribute of the bitvector object to False. Obviously, this can
598 only be done for seekable streams such as those connected with disk
599 files. According to Blair Houghton, a similar feature could
600 presumably be implemented for socket streams by using recv() or
601 recvfrom() if you set the flags argument to MSG_PEEK.
602 '''
603 global hexdict
604 bitstring = ''
605 i = 0
606 while ( i < blocksize / 8 ):
607 i += 1
608 byte = bitvector.FILEIN.read(1)
609 if byte == '':
610 if len(bitstring) < blocksize:
611 bitvector.more_to_read = False
612 return bitstring
613 hexvalue = hex( ord( byte ) )
614 hexvalue = hexvalue[2:]
615 if len( hexvalue ) == 1:
616 hexvalue = '0' + hexvalue
617 bitstring += _hexdict[ hexvalue[0] ]
618 bitstring += _hexdict[ hexvalue[1] ]
619 file_pos = bitvector.FILEIN.tell()
620
621
622 next_byte = bitvector.FILEIN.read(1)
623 if next_byte:
624
625 bitvector.FILEIN.seek( file_pos )
626 else:
627 bitvector.more_to_read = False
628 return bitstring
629
630
631
632
634
636 if args:
637 raise ValueError(
638 '''BitVector constructor can only be called
639 with keyword arguments for the following
640 keywords: filename, fp (for fileobject),
641 size, intValue, bitlist (for a list or
642 tuple of bits, or bitstring)''')
643 filename = fp = intVal = size = bitlist = bitstring = None
644 if kwargs.has_key('filename'):filename=kwargs.pop('filename')
645 if kwargs.has_key('fp'): fp = kwargs.pop('fp')
646 if kwargs.has_key('size'): size = kwargs.pop('size')
647 if kwargs.has_key('intVal'): intVal = kwargs.pop('intVal')
648 if kwargs.has_key('bitlist'):
649 bitlist = kwargs.pop('bitlist')
650 if kwargs.has_key('bitstring') :
651 bitstring = kwargs.pop('bitstring')
652 self.filename = None
653 self.size = 0
654
655 if filename:
656 if fp or size or intVal or bitlist or bitstring:
657 raise ValueError(
658 '''When filename is specified, you cannot
659 give values to any other constructor args''')
660 self.filename = filename
661 self.FILEIN = open( filename, 'rb' )
662 self.more_to_read = True
663 return
664 elif fp:
665 if filename or size or intVal or bitlist or bitstring:
666 raise ValueError(
667 '''When fileobject is specified, you cannot
668 give values to any other constructor args''')
669 bits = self.read_bits_from_fileobject( fp )
670 bitlist = map( int, bits )
671 self.size = len( bitlist )
672 elif intVal or intVal == 0:
673 if filename or fp or bitlist or bitstring:
674 raise ValueError(
675 '''When intVal is specified, you can only give
676 a value to the 'size' constructor arg''')
677 if intVal == 0:
678 bitlist = [0]
679 self.size = 1
680 else:
681 hexVal = hex( intVal ).lower().rstrip('l')
682 hexVal = hexVal[2:]
683 if len( hexVal ) == 1:
684 hexVal = '0' + hexVal
685 bitlist = ''.join(map(lambda x: _hexdict[x],hexVal))
686 bitlist = map( int, bitlist )
687 i = 0
688 while ( i < len( bitlist ) ):
689 if bitlist[i] == 1: break
690 i += 1
691 del bitlist[0:i]
692 if not size:
693 self.size = len( bitlist )
694 else:
695 if size < len(bitlist):
696 raise ValueError(
697 '''The value specified for size must be
698 at least as large as for the smallest
699 bit vector possible for intVal''')
700 n = size - len(bitlist)
701 bitlist = [0]*n + bitlist
702 self.size = len( bitlist )
703 elif size >= 0:
704 if filename or fp or intVal or bitlist or bitstring:
705 raise ValueError(
706 '''When size is specified (without an intVal),
707 you cannot give values to any other
708 constructor args''')
709 self.size = size
710 bitlist = tuple( [0] * size )
711 elif bitstring or bitstring == '':
712 if filename or fp or size or intVal or bitlist:
713 raise ValueError(
714 '''When a bitstring is specified, you cannot
715 give values to any other constructor args''')
716 bitlist = map( int, list(bitstring) )
717 self.size = len( bitlist )
718 elif bitlist:
719 if filename or fp or size or intVal or bitstring:
720 raise ValueError(
721 '''When bits are specified, you cannot
722 give values to any other constructor args''')
723 self.size = len( bitlist )
724 else:
725 raise ValueError("wrong arg(s) for constructor")
726 two_byte_ints_needed = (len(bitlist) + 15) // 16
727 self.vector = array.array( 'H', [0]*two_byte_ints_needed )
728 map( self._setbit, enumerate(bitlist), bitlist)
729
730
732 'Set the bit at the designated position to the value shown'
733 if val not in (0, 1):
734 raise ValueError( "incorrect value for a bit" )
735 if isinstance( posn, (tuple) ):
736 posn = posn[0]
737 if posn >= self.size or posn < -self.size:
738 raise ValueError( "index range error" )
739 if posn < 0: posn = self.size + posn
740 block_index = posn // 16
741 shift = posn & 15
742 cv = self.vector[block_index]
743 if ( cv >> shift ) & 1 != val:
744 self.vector[block_index] = cv ^ (1 << shift)
745
746
748 'Get the bit from the designated position'
749 if posn >= self.size or posn < -self.size:
750 raise ValueError( "index range error" )
751 if posn < 0: posn = self.size + posn
752 return ( self.vector[posn//16] >> (posn&15) ) & 1
753
754
756 '''
757 Take a bitwise 'xor' of the bit vector on which
758 the method is invoked with the argument bit vector.
759 Return the result as a new bit vector. If the two
760 bit vectors are not of the same size, pad the shorter
761 one with zero's from the left.
762 '''
763 if self.size < other.size:
764 bv1 = self._resize_pad_from_left(other.size - self.size)
765 bv2 = other
766 else:
767 bv1 = self
768 bv2 = other._resize_pad_from_left(self.size - other.size)
769 res = BitVector( size = bv1.size )
770 res.vector = map(operator.__xor__, bv1.vector, bv2.vector)
771 return res
772
773
775 '''
776 Take a bitwise 'and' of the bit vector on which the method is
777 invoked with the argument bit vector. Return the result as a
778 new bit vector. If the two bit vectors are not of the same
779 size, pad the shorter one with zero's from the left.
780 '''
781 if self.size < other.size:
782 bv1 = self._resize_pad_from_left(other.size - self.size)
783 bv2 = other
784 else:
785 bv1 = self
786 bv2 = other._resize_pad_from_left(self.size - other.size)
787 res = BitVector( size = bv1.size )
788 res.vector = map(operator.__and__, bv1.vector, bv2.vector)
789 return res
790
791
793 '''
794 Take a bitwise 'or' of the bit vector on which the
795 method is invoked with the argument bit vector. Return
796 the result as a new bit vector. If the two bit vectors
797 are not of the same size, pad the shorter one with
798 zero's from the left.
799 '''
800 if self.size < other.size:
801 bv1 = self._resize_pad_from_left(other.size - self.size)
802 bv2 = other
803 else:
804 bv1 = self
805 bv2 = other._resize_pad_from_left(self.size - other.size)
806 res = BitVector( size = bv1.size )
807 res.vector = map( operator.__or__, bv1.vector, bv2.vector)
808 return res
809
810
812 '''
813 Invert the bits in the bit vector on which the
814 method is invoked and return the result as a new
815 bit vector.
816 '''
817 res = BitVector( size = self.size )
818 res.vector = map( operator.__inv__, self.vector )
819 return res
820
821
823 '''
824 Concatenate the argument bit vector with the bit
825 vector on which the method is invoked. Return the
826 concatenated bit vector as a new BitVector object.
827 '''
828 i = 0
829 outlist = []
830 while ( i < self.size ):
831 outlist.append( self[i] )
832 i += 1
833 i = 0
834 while ( i < other.size ):
835 outlist.append( other[i] )
836 i += 1
837 return BitVector( bitlist = outlist )
838
839
841 'Return the number of bits in a bit vector.'
842 return self.size
843
844
846 '''
847 Read blocksize bits from a disk file and return a
848 BitVector object containing the bits. If the file
849 contains fewer bits than blocksize, construct the
850 BitVector object from however many bits there are
851 in the file. If the file contains zero bits, return
852 a BitVector object of size attribute set to 0.
853 '''
854 error_str = '''You need to first construct a BitVector
855 object with a filename as argument'''
856 if not self.filename:
857 raise SyntaxError( error_str )
858 if blocksize % 8 != 0:
859 raise ValueError( "block size must be a multiple of 8" )
860 bitstr = _readblock( blocksize, self )
861 if len( bitstr ) == 0:
862 return BitVector( size = 0 )
863 else:
864 return BitVector( bitstring = bitstr )
865
866
867
869 '''
870 This function is meant to read a bit string from a
871 file like object.
872 '''
873 bitlist = []
874 while 1:
875 bit = fp.read()
876 if bit == '': return bitlist
877 bitlist += bit
878
879
881 '''
882 This function is meant to write a bit vector directly to
883 a file like object. Note that whereas 'write_to_file'
884 method creates a memory footprint that corresponds exactly
885 to the bit vector, the 'write_bits_to_fileobject' actually
886 writes out the 1's and 0's as individual items to the
887 file object. That makes this method convenient for
888 creating a string representation of a bit vector,
889 especially if you use the StringIO class, as shown in
890 the test code.
891 '''
892 for bit_index in range(self.size):
893 if self[bit_index] == 0:
894 fp.write( '0' )
895 else:
896 fp.write( '1' )
897
898
900 '''
901 Divides an even-sized bit vector into two and returns
902 the two halves as a list of two bit vectors.
903 '''
904 if self.size % 2 != 0:
905 raise ValueError( "must have even num bits" )
906 i = 0
907 outlist1 = []
908 while ( i < self.size /2 ):
909 outlist1.append( self[i] )
910 i += 1
911 outlist2 = []
912 while ( i < self.size ):
913 outlist2.append( self[i] )
914 i += 1
915 return [ BitVector( bitlist = outlist1 ),
916 BitVector( bitlist = outlist2 ) ]
917
918
920 '''
921 Permute a bit vector according to the indices
922 shown in the second argument list. Return the
923 permuted bit vector as a new bit vector.
924 '''
925 if max(permute_list) > self.size -1:
926 raise ValueError( "Bad permutation index" )
927 outlist = []
928 i = 0
929 while ( i < len( permute_list ) ):
930 outlist.append( self[ permute_list[i] ] )
931 i += 1
932 return BitVector( bitlist = outlist )
933
934
936 '''
937 Unpermute the bit vector according to the
938 permutation list supplied as the second argument.
939 If you first permute a bit vector by using permute()
940 and then unpermute() it using the same permutation
941 list, you will get back the original bit vector.
942 '''
943 if max(permute_list) > self.size -1:
944 raise exceptions.ValueError, "Bad permutation index"
945 if self.size != len( permute_list ):
946 raise exceptions.ValueError,"Bad size for permute list"
947 out_bv = BitVector( size = self.size )
948 i = 0
949 while ( i < len(permute_list) ):
950 out_bv[ permute_list[i] ] = self[i]
951 i += 1
952 return out_bv
953
954
956 '''
957 (Contributed by Joe Davidson) Write the bitvector
958 to the file object file_out. (A file object is
959 returned by a call to open()). Since all file I/O
960 is byte oriented, the bitvector must be multiple
961 of 8 bits. Each byte treated as MSB first (0th index).
962 '''
963 err_str = '''Only a bit vector whose length is a multiple of 8
964 can be written to a file. Use the padding functions
965 to satisfy this constraint.'''
966 if self.size % 8:
967 raise exceptions.ValueError, err_str
968 for byte in range(self.size/8 ):
969 value = 0
970 for bit in range(8):
971 value += (self._getbit( byte*8 + (7 - bit) ) << bit )
972 file_out.write( chr(value) )
973
974
976 '''
977 For closing a file object that was used for reading
978 the bits into one or more BitVector objects.
979 '''
980 if not self.filename:
981 raise exceptions.SyntaxError, "No associated open file"
982 self.FILEIN.close()
983
984
986 'Return the integer value of a bitvector'
987 intVal = 0
988 for i in range(self.size):
989 intVal += self[i] * (2 ** (self.size - i - 1))
990 return intVal
991
992
994 'For an in-place left circular shift by n bit positions'
995 for i in range(n):
996 self.circular_rotate_left_by_one()
997
998
1000 'For an in-place right circular shift by n bit positions.'
1001 for i in range(n):
1002 self.circular_rotate_right_by_one()
1003
1004
1006 'For a one-bit in-place left circular shift'
1007 size = len(self.vector)
1008 bitstring_leftmost_bit = self.vector[0] & 1
1009 left_most_bits = map(operator.__and__, self.vector, [1]*size)
1010 left_most_bits.append(left_most_bits[0])
1011 del(left_most_bits[0])
1012 self.vector = map(operator.__rshift__, self.vector, [1]*size)
1013 self.vector = map( operator.__or__, self.vector, \
1014 map(operator.__lshift__, left_most_bits, [15]*size) )
1015 self._setbit(self.size -1, bitstring_leftmost_bit)
1016
1017
1019 'For a one-bit in-place right circular shift'
1020 size = len(self.vector)
1021 bitstring_rightmost_bit = self[self.size - 1]
1022 right_most_bits = map( operator.__and__,
1023 self.vector, [0x8000]*size )
1024 map( operator.__and__, self.vector, [~0x8000]*size )
1025 right_most_bits.insert(0, bitstring_rightmost_bit)
1026 right_most_bits.pop()
1027 self.vector = map(operator.__lshift__, self.vector, [1]*size)
1028 self.vector = map( operator.__or__, self.vector, \
1029 map(operator.__rshift__, right_most_bits, [15]*size) )
1030 self._setbit(0, bitstring_rightmost_bit)
1031
1032
1034 '''
1035 This is merely another implementation of the method
1036 circular_rotate_left_by_one() shown above. This one
1037 does NOT use map functions. This method carries out a
1038 one-bit left circular shift of a bit vector.
1039 '''
1040 max_index = (self.size -1) // 16
1041 left_most_bit = self.vector[0] & 1
1042 self.vector[0] = self.vector[0] >> 1
1043 for i in range(1, max_index + 1):
1044 left_bit = self.vector[i] & 1
1045 self.vector[i] = self.vector[i] >> 1
1046 self.vector[i-1] |= left_bit << 15
1047 self._setbit(self.size -1, left_most_bit)
1048
1049
1051 '''
1052 This is merely another implementation of the method
1053 circular_rotate_right_by_one() shown above. This one
1054 does NOT use map functions. This method does a one-bit
1055 right circular shift of a bit vector.
1056 '''
1057 max_index = (self.size -1) // 16
1058 right_most_bit = self[self.size - 1]
1059 self.vector[max_index] &= ~0x8000
1060 self.vector[max_index] = self.vector[max_index] << 1
1061 for i in range(max_index-1, -1, -1):
1062 right_bit = self.vector[i] & 0x8000
1063 self.vector[i] &= ~0x8000
1064 self.vector[i] = self.vector[i] << 1
1065 self.vector[i+1] |= right_bit >> 15
1066 self._setbit(0, right_most_bit)
1067
1068
1069
1070 __getitem__ = _getbit
1071 __setitem__ = _setbit
1072
1073
1075 'Allow slicing with [i:j], [:], etc.'
1076 slicebits = []
1077 if j > self.size: j = self.size
1078 for x in range(i,j):
1079 slicebits.append( self[x] )
1080 return BitVector( bitlist = slicebits )
1081
1082
1083
1084 __len__ = _getsize
1085
1086 __int__ = intValue
1087
1089 '''
1090 To allow iterations over a bit vector by supporting the
1091 'for bit in bit_vector' syntax:
1092 '''
1093 return BitVectorIterator( self )
1094
1095
1097 'To create a print representation'
1098 if self.size == 0:
1099 return ''
1100 return ''.join( map( str, self ) )
1101
1102
1103
1104
1106 if self.size != other.size:
1107 return False
1108 i = 0
1109 outlist = []
1110 while ( i < self.size ):
1111 if (self[i] != other[i]): return False
1112 i += 1
1113 return True
1115 return not self == other
1117 return self.intValue() < other.intValue()
1119 return self.intValue() <= other.intValue()
1121 return self.intValue() > other.intValue()
1123 return self.intValue() >= other.intValue()
1124
1125
1126
1128 'Make a deep copy of a bit vector'
1129 copy = str( self )
1130 return BitVector( bitstring = copy )
1131
1132
1134 '''
1135 Resize a bit vector by padding with n 0's
1136 from the left. Return the result as a new bit
1137 vector.
1138 '''
1139 new_str = '0'*n + str( self )
1140 return BitVector( bitstring = new_str )
1141
1142
1144 '''
1145 Resize a bit vector by padding with n 0's
1146 from the right. Return the result as a new bit
1147 vector.
1148 '''
1149 new_str = str( self ) + '0'*n
1150 return BitVector( bitstring = new_str )
1151
1152
1154 'Pad a bit vector with n zeros from the left'
1155 new_str = '0'*n + str( self )
1156 bitlist = map( int, list(new_str) )
1157 self.size = len( bitlist )
1158 two_byte_ints_needed = (len(bitlist) + 15) // 16
1159 self.vector = array.array( 'H', [0]*two_byte_ints_needed )
1160 map( self._setbit, enumerate(bitlist), bitlist)
1161
1162
1164 'Pad a bit vector with n zeros from the right'
1165 new_str = str( self ) + '0'*n
1166 bitlist = map( int, list(new_str) )
1167 self.size = len( bitlist )
1168 two_byte_ints_needed = (len(bitlist) + 15) // 16
1169 self.vector = array.array( 'H', [0]*two_byte_ints_needed )
1170 map( self._setbit, enumerate(bitlist), bitlist)
1171
1172
1174 '''
1175 This supports 'if x in y' and 'if x not in y'
1176 syntax for bit vectors.
1177 '''
1178 if self.size == 0:
1179 raise ValueError, "First arg bitvec has no bits"
1180 elif self.size < otherBitVec.size:
1181 raise ValueError, "First arg bitvec too short"
1182 max_index = self.size - otherBitVec.size + 1
1183 for i in range(max_index):
1184 testbv = self[i:i+otherBitVec.size]
1185 if self[i:i+otherBitVec.size] == otherBitVec:
1186 return True
1187 return False
1188
1189
1190
1191
1194 self.items = []
1195 for i in range( bitvec.size ):
1196 self.items.append( bitvec._getbit(i) )
1197 self.index = -1
1199 return self
1201 self.index += 1
1202 if self.index < len( self.items ):
1203 return self.items[ self.index ]
1204 else:
1205 raise StopIteration
1206
1207
1208
1209
1210
1211
1212
1213 if __name__ == '__main__':
1214
1215
1216 bv1 = BitVector( size = 0 )
1217 print bv1
1218
1219
1220 bv2 = BitVector( size = 2 )
1221 print bv2
1222
1223
1224 print bv1 + bv2
1225
1226
1227 bv = BitVector( bitlist = (1, 0, 0, 1) )
1228 print bv
1229
1230
1231 bv = BitVector( bitlist = [1, 1, 0, 1] )
1232 print bv
1233
1234
1235 bv = BitVector( intVal = 5678 )
1236 print bv
1237 bv = BitVector( intVal = 0 )
1238 print bv
1239 bv = BitVector( intVal = 2 )
1240 print bv
1241 bv = BitVector( intVal = 3 )
1242 print bv
1243 bv = BitVector( intVal = 123456 )
1244 print bv
1245 print bv.intValue()
1246 print int( bv )
1247
1248
1249 import StringIO
1250 x = "111100001111"
1251 fp_read = StringIO.StringIO( x )
1252 bv = BitVector( fp = fp_read )
1253 print bv
1254
1255
1256 bv = BitVector( bitstring = '00110011' )
1257 print bv
1258
1259 bv = BitVector( bitstring = '' )
1260 print bv
1261
1262
1263 print bv.intValue()
1264
1265
1266 bv = BitVector( bitstring = '110001' )
1267 print bv[0], bv[1], bv[2], bv[3], bv[4], bv[5]
1268 print bv[-1], bv[-2], bv[-3], bv[-4], bv[-5], bv[-6]
1269
1270
1271
1272 bv = BitVector( bitstring = '1111' )
1273 print bv
1274 bv[0]=0;bv[1]=0;bv[2]=0;bv[3]=0
1275 print bv
1276 bv[-1]=1;bv[-2]=1;bv[-4]=1
1277 print bv
1278
1279
1280 bv1 = BitVector( bitstring = '00110011' )
1281 bv2 = BitVector( bitlist = [0,0,1,1,0,0,1,1] )
1282 print bv1 == bv2
1283 print bv1 != bv2
1284 print bv1 < bv2
1285 print bv1 <= bv2
1286 bv3 = BitVector( intVal = 5678 )
1287 print bv3.intValue()
1288 print bv3
1289 print bv1 == bv3
1290 print bv3 > bv1
1291 print bv3 >= bv1
1292
1293
1294
1295 fp_write = StringIO.StringIO()
1296 bv.write_bits_to_fileobject( fp_write )
1297 print fp_write.getvalue()
1298
1299
1300 bv3 = bv1 | bv2
1301 print bv3
1302 bv3 = bv1 & bv2
1303 print bv3
1304 bv3 = bv1 + bv2
1305 print bv3
1306 bv4 = BitVector( size = 3 )
1307 print bv4
1308 bv5 = bv3 + bv4
1309 print bv5
1310 bv6 = ~bv5
1311 print bv6
1312 bv7 = bv5 & bv6
1313 print bv7
1314 bv7 = bv5 | bv6
1315 print bv7
1316
1317
1318 print BitVector( intVal = 6 ) ^ BitVector( intVal = 13 )
1319 print BitVector( intVal = 6 ) & BitVector( intVal = 13 )
1320 print BitVector( intVal = 6 ) | BitVector( intVal = 13 )
1321
1322 print BitVector( intVal = 1 ) ^ BitVector( intVal = 13 )
1323 print BitVector( intVal = 1 ) & BitVector( intVal = 13 )
1324 print BitVector( intVal = 1 ) | BitVector( intVal = 13 )
1325
1326
1327 bv7[7] = 0
1328 print bv7
1329 print len( bv7 )
1330 bv8 = (bv5 & bv6) ^ bv7
1331 print bv8
1332
1333
1334 bv = BitVector( bitlist= [0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1] )
1335 print bv
1336
1337
1338 bv = BitVector( filename = 'TestBitVector/testinput1.txt' )
1339 print bv
1340 bv1 = bv.read_bits_from_file(64)
1341 print bv1
1342
1343 bv2 = bv.read_bits_from_file(64)
1344 print bv2
1345
1346 bv3 = bv1 ^ (bv2)
1347 print bv3
1348
1349
1350
1351 [bv4, bv5] = bv3.divide_into_two()
1352 print bv4
1353 print bv5
1354
1355
1356 bv1 = BitVector( bitlist = [1, 0, 0, 1, 1, 0, 1] )
1357 print bv1
1358
1359 bv2 = bv1.permute( [6, 2, 0, 1] )
1360 print bv2
1361 bv3 = BitVector( bitlist = [1, 1, 0, 0, 0, 1, 1] )
1362 print bv3
1363 bv4 = bv1 & bv3
1364 print bv4
1365 print
1366
1367
1368 bv = BitVector( filename = 'TestBitVector/testinput4.txt' )
1369 while (bv.more_to_read):
1370 bv_read = bv.read_bits_from_file( 64 )
1371 print bv_read
1372 print
1373
1374
1375
1376
1377 bv.close_file_object()
1378 bv = BitVector( filename = 'TestBitVector/testinput4.txt' )
1379 bv1 = bv.read_bits_from_file(64)
1380 print bv1
1381 FILEOUT = open( 'TestBitVector/testinput5.txt', 'w' )
1382 bv1.write_to_file( FILEOUT )
1383 FILEOUT.close()
1384
1385
1386
1387
1388 bv2 = bv1.permute( [22, 47, 33, 36, 18, 6, 32, 29, 54, 62, 4,
1389 9, 42, 39, 45, 59, 8, 50, 35, 20, 25, 49,
1390 15, 61, 55, 60, 0, 14, 38, 40, 23, 17, 41,
1391 10, 57, 12, 30, 3, 52, 11, 26, 43, 21, 13,
1392 58, 37, 48, 28, 1, 63, 2, 31, 53, 56, 44, 24,
1393 51, 19, 7, 5, 34, 27, 16, 46] )
1394 print bv2
1395
1396 bv3 = bv2.unpermute( [22, 47, 33, 36, 18, 6, 32, 29, 54, 62, 4,
1397 9, 42, 39, 45, 59, 8, 50, 35, 20, 25, 49,
1398 15, 61, 55, 60, 0, 14, 38, 40, 23, 17, 41,
1399 10, 57, 12, 30, 3, 52, 11, 26, 43, 21, 13,
1400 58, 37, 48, 28, 1, 63, 2, 31, 53, 56, 44, 24,
1401 51, 19, 7, 5, 34, 27, 16, 46] )
1402 print bv3
1403
1404 print
1405 print
1406
1407
1408 print bv3
1409 bv3 << 7
1410 print bv3
1411 bv3 >> 7
1412 print bv3
1413
1414
1415 print len( bv3 )
1416
1417
1418 bv4 = bv3[5:22]
1419 print bv4
1420
1421
1422 for item in bv4:
1423 print item,
1424 print
1425
1426
1427 bv = BitVector( bitstring = '101010' )
1428 bv.pad_from_left( 4 )
1429 print bv
1430
1431 bv.pad_from_right( 4 )
1432 print bv
1433
1434
1435 try:
1436 bv1 = BitVector( bitstring = '0011001100' )
1437 bv2 = BitVector( bitstring = '110011' )
1438 if bv2 in bv1:
1439 print "%s is in %s" % (bv2, bv1)
1440 else:
1441 print "%s is not in %s" % (bv2, bv1)
1442 except ValueError, arg:
1443 print "Error Message: " + str(arg)
1444
1445
1446
1447 bv = BitVector( intVal = 45, size = 16 )
1448 print bv
1449