@author gvt
We need a picture to understand this little evil:
  |----------------|- n -------------------------------------------> start
  |                |                                               |
  | ^^^^ HEAD ^^^^ |^^^^^^^^^^^^^^^^^^^^^^^^^^^> offset            |
  | ^            ^ |                           |                   |
  |----------------|- n+1                      |                   |
  | ^            ^ |                           |                   |
  . .            . .                           |                   |
  . .    BODY    . .                           |                   |
  | ^            ^ |                        length                lA
  |----------------|- q-1                      |                   |
  | ^            ^ |                           |                   |
  | ^    BODY    ^ |                           |                   |
  | ^            ^ |                           |                   |
  |----------------|- q = (n+m+k)              |                   |
  | ^            ^ |                           |                   |
  | ^^^^ TAIL ^^^^ |^^^^^^^^^^^^^^^^^^^^^^^^^^^> lst               |
  |                |                                               |
  |----------------|- q+1 = (n+m+k+p) = n+nA ----------------------> end
where: offset >= 0, length >= 0, align > 0 are given and lst = (offset+length).
The "Buffer" is defined, in terms of offsets of bytes that have to be written to or read from the device, by the set:
U = { i | offset <= i < lst }
and the "Aligned Buffer", that has to be handled on the device, is defined by:
A = { i | start <= i < end }
where
start=max { i>=0 | i <= offset, i % align = 0 }
end=min { i>=0 | i >= lst, i % align = 0 }
"A" is the "smallest" aligned buffer that contains U, i.e. start = min(A) and end = max(A) + 1 are multiples of "align" and there is no smaller set (in terms of the number of elements o(A)) with aligned bounds that contains "U".
Let me define:
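  n = offset / align      ofsR = offset % align
  m = length / align      lenR = length % align
  q = lst / align         lstR = lst % align
where "/" is the Euclidean integer division and "%" is the Euclidean integer remainder (or modulo), written with the usual Java tokens for these operators. Java's operators truncate toward zero, which coincides with the Euclidean definition here because offset >= 0, length >= 0 and align > 0.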
From the definition of Euclidean division we have:
  offset = n * align + ofsR      0 <= ofsR < align
  length = m * align + lenR      0 <= lenR < align
  lst    = q * align + lstR      0 <= lstR < align
It is very easy to see that:
start = offset - ofsR = n * align
end = lst - lstR + p * align = (n+m+k+p) * align
where
       / 0,  lstR = 0               / 0,  (ofsR+lenR) <  align
  p = |                        k = |
       \ 1,  lstR > 0               \ 1,  (ofsR+lenR) >= align
The last equation comes from this fact (note that 0 <= ofsR+lenR < 2*align, so ofsR+lenR = k*align + lstR with k = 0 or k = 1):
  lst = offset + length = (n+m)*align + (ofsR+lenR) =
      = (n+m)*align + (k*align + lstR) =
      = (n+m+k)*align + lstR
i.e. q = n + m + k.
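The quantities defined so far can be computed directly with Java's integer operators. What follows is only a sketch: the class name AlignmentTerms is hypothetical and is not part of the original code, and it assumes offset >= 0, length >= 0 and align > 0.

```java
/** Hypothetical helper (not the original code): the terms used in this comment. */
final class AlignmentTerms {
    final long n, ofsR, m, lenR, q, lstR, k, p, start, end;

    AlignmentTerms(long offset, long length, long align) {
        if (offset < 0 || length < 0 || align <= 0) {
            throw new IllegalArgumentException("need offset >= 0, length >= 0, align > 0");
        }
        long lst = offset + length;
        n = offset / align;
        ofsR = offset % align;
        m = length / align;
        lenR = length % align;
        q = lst / align;
        lstR = lst % align;
        // k is the carry produced by adding the two remainders
        k = (ofsR + lenR >= align) ? 1 : 0;
        // p is 1 when the last block is only partially covered
        p = (lstR > 0) ? 1 : 0;
        start = n * align;
        end = (n + m + k + p) * align;
    }
}
```

For offset=7, length=6, align=4 this gives n=1, ofsR=3, m=1, lenR=2, so k=1, and lst=13 gives q=3=n+m+k, lstR=1, p=1, start=4, end=16.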
Hence, the "order of A", o(A), i.e. the number of elements in A, is:
lA = o(A) = (end - start) = (n + m + k + p) * align - n * align = (m + k + p) * align
and the number of "aligned blocks" in "A" is:
nA = lA/align = m + k + p
that is the "order" o(AA), of the set:
  AA = { i/align | i % align = 0, start <= i < end } =
     = { n, n+1, n+2, ..., n+m+k+p-1 }
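As a quick illustration, the set AA can be enumerated from start and end alone. This is a sketch under the same assumptions (offset >= 0, length >= 0, align > 0); the class and method names are hypothetical.

```java
import java.util.stream.LongStream;

/** Hypothetical helper (not the original code): lists the block indices in AA. */
final class AlignedBlocks {
    static long[] blockIndices(long offset, long length, long align) {
        long lst = offset + length;
        long start = offset - offset % align;
        long lstR = lst % align;
        long end = (lstR == 0) ? lst : lst - lstR + align;
        // AA = { start/align, ..., end/align - 1 }
        return LongStream.range(start / align, end / align).toArray();
    }
}
```

For offset=5, length=10, align=4 the result is { 1, 2, 3 }: three aligned blocks covering the bytes 4..15.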
These considerations establish the basic quantities I need to correctly decide how the input buffer has to be read from or written to the device, a device that is supposed to be able to handle only aligned blocks, i.e. offsets and lengths that are integer multiples of "align".
Now, we want to define what the HEAD "H", the BODY "B" and the TAIL "T" of the "Aligned Buffer" are. Again they will be defined as sets, "0" will be the "empty set" and we will use the notation nH=o(H), nB=o(B), nT=o(T) for the number of their elements.
Let me first get rid of the trivial "EMPTY" case, where nothing has to be done (length=0): we define H=0, B=0, T=0. So nB=nA=0.
Although it can be handled as a "BODY"-only buffer, we get rid of the trivial "ALIGNED" case too (ofsR=0 and lstR=0), defining H=0, B=AA, T=0. So nB=nA=m.
Anyway the EMPTY and the ALIGNED cases have to be handled separately in the code for efficiency purposes.
From now we can suppose (length>0) and (ofsR>0 or lstR>0).
Again we have to handle a special case, the case where the "Buffer" is completely (and properly) "CONTAINED" inside a "single aligned block". The case is identified by the condition "lst <= start + align" and, under the assumptions made so far, "0 < length < align". For this case it doesn't make any sense to define HEAD, BODY and TAIL but we define, as an artifact and for the sake of generality, H=0, B=0, T=0. We have "nB = 0 and nA = 1", so "nB = nA - 1".
From now on we can assume that "lst > start + align", i.e. the "Buffer" is "CROSSED": it crosses the boundary of at least "one aligned block". We can now define:
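H = { i/align | i % align = 0, start <= i < start + align, i not in U }
B = { i/align | i % align = 0, start <= i < end, i in U, (i + align - 1) in U }
T = { i/align | i % align = 0, end - align <= i < end, (i + align - 1) not in U }
That is, "H" holds the first block when its first byte is outside "U", "T" holds the last block when its last byte is outside "U", and "B" holds the blocks whose first and last bytes (and so all their bytes) are in "U".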
It is easy to see that "AA" is the union of the mutually disjoint sets "H", "B" and "T". So "nA = nH + nB + nT" where:
  0 <= nH <= 1
  0 <= nT <= 1
  0 <= nB <= nA
and that "nB = nA - nH - nT".
The "Aligned Buffer":
  has a HEAD (nH=1) if and only if ofsR > 0
  has a TAIL (nT=1) if and only if lstR > 0
and because we excluded the "ALIGNED" case, it must have a HEAD or a TAIL (or both) because ofsR>0 or lstR>0.
As an example, if the "Aligned Buffer" is CROSSED and has a HEAD, a non-empty BODY and a TAIL ( k=1, p=1, m>=1 ):
  H = { n }, B = { n+1, n+2, ..., n+m }, T = { n+m+1 }
  AA = { n, n+1, n+2, ..., n+m, n+m+1 }
Then:
        / nA-1,  (HEAD) and (no TAIL),  ofsR>0 and lstR=0
  nB = |  nA-1,  (no HEAD) and (TAIL),  ofsR=0 and lstR>0
        \ nA-2,  (HEAD) and (TAIL),     ofsR>0 and lstR>0
In general, using the conventions we adopted for the EMPTY and ALIGNED cases:
        / nA=0,  EMPTY        , length = 0
       |  nA=m,  ALIGNED      , ofsR = 0, lstR = 0
  nB = |  nA-1,  CONTAINED    , lst <= start + align
       |  nA-1,  CROSSED(H)   , lst > start + align, ofsR>0, lstR=0
       |  nA-1,  CROSSED(T)   , lst > start + align, ofsR=0, lstR>0
        \ nA-2,  CROSSED(HT)  , lst > start + align, ofsR>0, lstR>0
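The case table can be turned into a small classifier. The sketch below is NOT the "BufferType" inner class itself, whose code is not shown here; BufferKind and its fields are hypothetical names used only to illustrate the case analysis, with offset >= 0, length >= 0 and align > 0 assumed.

```java
/** Hypothetical classifier (not the original BufferType): follows the case table. */
final class BufferKind {
    enum Type { EMPTY, ALIGNED, CONTAINED, CROSSED }

    final Type type;
    final boolean head;  // nH == 1
    final boolean tail;  // nT == 1
    final long nA;       // aligned blocks touched on the device
    final long nB;       // blocks counted as BODY

    BufferKind(long offset, long length, long align) {
        if (offset < 0 || length < 0 || align <= 0) {
            throw new IllegalArgumentException("need offset >= 0, length >= 0, align > 0");
        }
        long lst = offset + length;
        long ofsR = offset % align;
        long lstR = lst % align;
        long start = offset - ofsR;
        long end = (lstR == 0) ? lst : lst - lstR + align;

        if (length == 0) {
            // EMPTY: nothing to do, nA = nB = 0 by convention
            type = Type.EMPTY; head = false; tail = false; nA = 0; nB = 0;
        } else if (ofsR == 0 && lstR == 0) {
            // ALIGNED: every block is a BODY block
            type = Type.ALIGNED; head = false; tail = false;
            nA = (end - start) / align; nB = nA;
        } else if (lst <= start + align) {
            // CONTAINED: the buffer sits properly inside one aligned block
            type = Type.CONTAINED; head = false; tail = false;
            nA = (end - start) / align; nB = 0;
        } else {
            // CROSSED: HEAD iff ofsR > 0, TAIL iff lstR > 0
            type = Type.CROSSED; head = ofsR > 0; tail = lstR > 0;
            nA = (end - start) / align;
            nB = nA - (head ? 1 : 0) - (tail ? 1 : 0);
        }
    }
}
```

For example, new BufferKind(5, 10, 4) is CROSSED with a HEAD and a TAIL, nA=3 and nB=1, while new BufferKind(5, 2, 4) is CONTAINED with nA=1 and nB=0.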
Now you can ask "why all of this semi-math stuff?" ... ;-) ... well it helps ... believe me ... it actually is a "proof" that the code inside the "BufferType" inner class does a correct job ... I hope ... The constructor of the helper "BufferType" class uses the "(offset,length)" pair to compute some of these quantities and uses them to decide if the "Buffer" at the given offset is EMPTY, ALIGNED, CONTAINED or CROSSED and to correctly compute nB for all the cases.
The code of the "read", "write" methods uses the type of the buffer, ofsR, lstR and nB to decide how it has to handle the "alignment" of the blocks.
In fact, roughly speaking, besides the trivial "EMPTY" and "ALIGNED" cases and the "CONTAINED" case, in the general "CROSSED" case we can optimize read and write for the "BODY" of the buffer. We cannot do anything for the CONTAINED case and for the HEAD and the TAIL of the buffer.
The basic idea is to reduce the number of "buffercopy" operations (read method) and the number of "read/buffercopy" operations (write method) to the minimum possible number (that actually is nA-nB). At most it will "buffercopy" or "read/buffercopy" 2 blocks (because 0<=nA-nB<=2).
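To make the idea concrete, here is one possible shape of such a read. It is a sketch only, not the actual "read" method: the AlignedDevice interface, the bounce buffer, the arrayDevice demo device and the bounceCopies counter are all hypothetical. BODY blocks go straight into the caller's array, while a HEAD, a TAIL or a CONTAINED buffer is read as one whole block and then copied.

```java
/** Hypothetical sketch (not the original code) of an alignment-aware read. */
final class AlignedReadSketch {
    /** Hypothetical device that only accepts aligned offsets and lengths. */
    interface AlignedDevice {
        void read(long devOffset, byte[] dst, int dstOffset, int length);
    }

    /** Number of "buffercopy" operations done by the last call to read (demo only). */
    static int bounceCopies;

    static void read(AlignedDevice dev, int align, long offset,
                     byte[] dst, int dstOffset, int length) {
        bounceCopies = 0;
        byte[] bounce = new byte[align];
        long pos = offset;
        int out = dstOffset;
        int left = length;
        while (left > 0) {
            int r = (int) (pos % align);
            if (r == 0 && left >= align) {
                // BODY: whole blocks are read directly into the caller's buffer.
                int whole = (left / align) * align;
                dev.read(pos, dst, out, whole);
                pos += whole; out += whole; left -= whole;
            } else {
                // HEAD, TAIL or CONTAINED: read one full block, copy the needed part.
                int n = Math.min(align - r, left);
                dev.read(pos - r, bounce, 0, align);
                System.arraycopy(bounce, r, dst, out, n);
                bounceCopies++;
                pos += n; out += n; left -= n;
            }
        }
    }

    /** Demo device backed by an in-memory array; rejects unaligned requests. */
    static AlignedDevice arrayDevice(byte[] data, int align) {
        return (devOffset, dst, dstOffset, length) -> {
            if (devOffset % align != 0 || length % align != 0) {
                throw new IllegalArgumentException("unaligned request");
            }
            System.arraycopy(data, (int) devOffset, dst, dstOffset, length);
        };
    }
}
```

With align=4, offset=5 and length=10 the loop does one copy for the HEAD, one direct read for the single BODY block and one copy for the TAIL, i.e. nA-nB = 2 copies.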
Hope this long comment will help to understand and verify the code and to understand how bad my brain is working ... ;-) ...
gvt