/*
* The MIT License (MIT)
*
* Copyright (c) 2007-2014 Daniel Alievsky, AlgART Laboratory (http://algart.net)
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in all
* copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*/
package net.algart.arrays;
import net.algart.math.functions.ConstantFunc;
import net.algart.math.Range;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
/**
* <p>Universal converter of bitwise operation (an algorithm processing {@link BitArray})
* to operation over any primitive type (an algorithm processing {@link PArray}).</p>
*
* <p>This class implements the following common idea. Let we have some algorithm,
* that transforms the source bit array ({@link BitArray}) <b>b</b> to another bit array ƒ(<b>b</b>).
* (Here is an interface {@link GeneralizedBitProcessing.SliceOperation} representing such algorithm.)
* Let we want to generalize this algorithm for a case of any element types — <tt>byte</tt>, <tt>int</tt>,
* <tt>double</tt>, etc.; in other words, for the common case of {@link PArray}.
* This class allows to do this.</p>
*
* <p><a name="bitslice"></a>Namely, let we have the source array ({@link PArray}) <b>a</b>,
* and let <i>a<sub>min</sub></i>..<i>a<sub>max</sub></i> be some numeric range,
* <i>a<sub>min</sub></i>≤<i>a<sub>max</sub></i>
* usually (but not necessarily) from the minimal to the maximal value, stored in the source array
* ({@link Arrays#rangeOf(PArray) Arrays.rangeOf(<b>a</b>)}).
* This class can work in two modes, called <i>rounding modes</i>
* (and represented by {@link GeneralizedBitProcessing.RoundingMode} enum):
* the first mode is called <i>round-down mode</i>, and the second is called <i>round-up mode</i>.
* Below is the specification of the behaviour in both modes.</p>
*
* <div align="center">
* <table width="90%" border="1" cellpadding="8" cellspacing="0">
* <tr>
* <td width="50%"><b>Round-down mode</b></td>
* <td width="50%"><b>Round-up mode</b></td>
* </tr>
* <tr>
* <td colspan="2">
* <p>In both modes, we consider <i>n</i>+1 (<i>n</i>≥0) numeric <i>thresholds</i>
* <nobr><i>t</i><sub>0</sub>, <i>t</i><sub>1</sub>, ..., <i>t</i><sub><i>n</i></sub> in
* <i>a<sub>min</sub></i>..<i>a<sub>max</sub></i> range:</p>
* <blockquote>
* <i>t</i><sub>0</sub> = <i>a<sub>min</sub></i>,<br>
* <i>t</i><sub>1</sub> =
* <i>a<sub>min</sub></i> + (<i>a<sub>max</sub></i>−<i>a<sub>min</sub></i>)/<i>n</i>,<br>
* . . .<br>
* <i>t</i><sub><i>i</i></sub> =
* <i>a<sub>min</sub></i> + <i>i</i> * (<i>a<sub>max</sub></i>−<i>a<sub>min</sub></i>)/<i>n</i>,<br>
* . . .<br>
* <i>t</i><sub><i>n</i></sub> = <i>a<sub>max</sub></i>
* </blockquote>
* </td>
* </tr>
* <tr>
* <td>(In the degenerated case <i>n</i>=0, we consider
* <i>t</i><sub>0</sub> = <i>t</i><sub><i>n</i></sub> = <i>a<sub>min</sub></i>.)
* </td>
* <td>(In the degenerated case <i>n</i>=0, we consider
* <i>t</i><sub>0</sub> = <i>t</i><sub><i>n</i></sub> = <i>a<sub>max</sub></i>.)
* </td>
* </tr>
* <tr>
* <td>
* <p>Then we define the <i>bit slice</i> <b>b</b><sub><i>i</i></sub>, <i>i</i>=0,1,...,<i>n</i>,
* as a bit array <b>a</b>≥<i>t</i><sub><i>i</i></sub>, i.e. {@link BitArray} with the same length as <b>a</b>,
* where each element</p>
* <blockquote>
* <b>b</b><sub><i>i</i></sub>[k] = <b>a</b>[k]≥<i>t</i><sub><i>i</i></sub> ? 1 : 0
* </blockquote>
* </td>
* <td>
* <p>Then we define the <i>bit slice</i> <b>b</b><sub><i>i</i></sub>, <i>i</i>=0,1,...,<i>n</i>,
* as a bit array <b>a</b>><i>t</i><sub><i>i</i></sub>, i.e. {@link BitArray} with the same length as <b>a</b>,
* where each element</p>
* <blockquote>
* <b>b</b><sub><i>i</i></sub>[k] = <b>a</b>[k]><i>t</i><sub><i>i</i></sub> ? 1 : 0
* </blockquote>
* </td>
* </tr>
* <tr>
* <td colspan="2">
* The described transformation of the numeric array <b>a</b> to a set of <i>n</i>+1 "slices"
* (bit arrays) <b>b</b><sub><i>i</i></sub> is called <i>splitting to slices</i>.
* Then we consider the backward conversion of the set of bit slices
* <b>b</b><sub><i>i</i></sub> to a numeric array <b>a'</b>, called <i>joining the slices</i>:
* </td>
* </tr>
* <tr>
* <td>
* <blockquote>
* <b>a'</b>[k] = <i>t</i><sub>max <i>i</i>: <b>b</b><sub><i>i</i></sub>[k] = 1</sub>,
* or <b>a'</b>[k] = <i>a<sub>min</sub></i> if all <b>b</b><sub><i>i</i></sub>[k] = 0
* </blockquote>
* </td>
* <td>
* <blockquote>
* <b>a'</b>[k] = <i>t</i><sub>min <i>i</i>: <b>b</b><sub><i>i</i></sub>[k] = 0</sub>,
* or <b>a'</b>[k] = <i>a<sub>max</sub></i> if all <b>b</b><sub><i>i</i></sub>[k] = 1
* </blockquote>
* </td>
* </tr>
* </table>
* </div>
*
* <p>It's obvious that if all <b>a</b> elements are inside <i>a<sub>min</sub></i>..<i>a<sub>max</sub></i>
* range and if <i>n</i> is large enough, then <b>a'</b> is almost equal to <b>a</b>.
* In particular, if <b>a</b> is a byte array ({@link ByteArray}), <i>a<sub>min</sub></i>=0,
* <i>a<sub>max</sub></i>=255 and <i>n</i>=255, then <b>a'</b> is strictly equal to <b>a</b>
* (in both rounding modes).</p>
*
* <p>The main task, solved by this class, is converting any bitwise operation ƒ(<b>b</b>)
* to a new operation <i>g</i>(<b>a</b>), defining for any primitive-type array <b>a</b>,
* according to the following rule:</p>
*
* <div align="center">
* <table width="90%" border="1" cellpadding="8" cellspacing="0">
* <tr>
* <td width="50%"><b>Round-down mode</b></td>
* <td width="50%"><b>Round-up mode</b></td>
* </tr>
* <tr>
* <td>
* <blockquote>
* <i>g</i>(<b>a</b>)[k] = <i>t</i><sub>max <i>i</i>: ƒ(<b>b</b><sub><i>i</i></sub>)[k] = 1</sub>,
* or <i>g</i>(<b>a</b>)[k] = <i>a<sub>min</sub></i> if all ƒ(<b>b</b><sub><i>i</i></sub>)[k] = 0
* </blockquote>
* </td>
* <td>
* <blockquote>
* <i>g</i>(<b>a</b>)[k] = <i>t</i><sub>min <i>i</i>: ƒ(<b>b</b><sub><i>i</i></sub>)[k] = 0</sub>,
* or <i>g</i>(<b>a</b>)[k] = <i>a<sub>max</sub></i> if all ƒ(<b>b</b><sub><i>i</i></sub>)[k] = 0
* </blockquote>
* </td>
* </tr>
* <tr>
* <td colspan="2">
* In other words, the source array is split into bit slices, the bitwise operation is applied to all slices,
* and then we perform the backward conversion (joining) of slices set to a numeric array <i>g</i>(<b>a</b>).
* </td>
* </tr>
* </table>
* </div>
*
* <p>The conversion of the source array <b>a</b> to the target array <b>c</b>=<i>g</i>(<b>a</b>)
* is performed by the main
* method of this class: {@link #process(UpdatablePArray c, PArray a, Range range, long numberOfSlices)}.
* The <i>a<sub>min</sub></i>..<i>a<sub>max</sub></i> range and the number of slices
* <tt>numberOfSlices</tt>=<i>n</i>+1 are specified as arguments of this method.
* The original bitwise algorithm should be specified while creating an instance of this class by
* {@link #getInstance(ArrayContext, GeneralizedBitProcessing.SliceOperation, GeneralizedBitProcessing.RoundingMode)
* getInstance} method.</p>
*
* <p>Note that the described operation does not require to calculate ƒ(<b>b</b><sub>0</sub>)
* in the round-down mode or ƒ(<b>b</b><sub><i>n</i></sub>) in the round-up mode:
* the result does not depend on it. Also note that the case <i>n</i>=0 is degenerated:
* in this case always
* <nobr><i>g</i>(<b>a</b>)[k] = <i>a<sub>min</sub></i></nobr> (round-down mode)
* or <i>a<sub>max</sub></i> (round-up mode).</p>
*
* <p>Additional useful note: for some kinds of bitwise algorithms, you can improve the precision of the results
* by replacing (after calling {@link #process(UpdatablePArray, PArray, Range, long) process} method)
* the resulting array <b>c</b>=<i>g</i>(<b>a</b>)
* with elementwise maximum, in case of round-down mode, or elementwise minimum, in case of round-up mode,
* of <b>c</b> and the source array <b>a</b>: <nobr><b>c</b>=max(<b>c</b>,<b>a</b>)</nobr> or
* <nobr><b>c</b>=min(<b>c</b>,<b>a</b>)</nobr> correspondingly.
* You can do it easily by
* {@link Arrays#applyFunc(ArrayContext, net.algart.math.functions.Func, UpdatablePArray, PArray...)} method
* with {@link net.algart.math.functions.Func#MAX Func.MAX} or {@link net.algart.math.functions.Func#MIN Func.MIN}
* argument.</p>
*
* <p>This class allocates, in {@link #process(UpdatablePArray, PArray, Range, long) process} method, some temporary
* {@link UpdatableBitArray bit arrays}. Arrays are allocated with help of the memory model,
* returned by <tt>context.{@link ArrayContext#getMemoryModel() getMemoryModel()}</tt> method
* of the <tt>context</tt>, specified while creating an instance of this class.
* If the context is <tt>null</tt>, or if necessary amount of memory is less than
* {@link Arrays.SystemSettings#maxTempJavaMemory()}, then {@link SimpleMemoryModel} is used
* for allocating temporary arrays. There is a special case when
* {@link #process(UpdatablePArray, PArray, Range, long) process} method
* is called for bit arrays (element type is <tt>boolean</tt>); in this case, no AlgART arrays
* are allocated.
*
* <p>When this class allocates temporary AlgART arrays, it also checks whether the passed (source and destination)
* arrays are <i>tiled</i>, i.e. they are underlying arrays of some matrices, created by
* {@link Matrix#tile(long...)} or {@link Matrix#tile()} method.
* Only the case, when these methods are implemented in this package, is recognized.
* In this case, if the <tt>src</tt> array, passed to {@link #process(UpdatablePArray, PArray, Range, long) process}
* method, is tiled, then the temporary AlgART arrays are tiled with the same tile structure.
*
* <p>This class uses multithreading (alike {@link Arrays#copy(ArrayContext, UpdatableArray, Array)}
* and similar methods) to optimize calculations on multiprocessor or multi-core computers.
* Namely, the {@link #process(UpdatablePArray, PArray, Range, long) process} method
* processes different bit slices in several parallel threads.
* However, it is not useful if the bitwise processing method {@link
* GeneralizedBitProcessing.SliceOperation#processBits(ArrayContext, UpdatableBitArray, BitArray, long, int, int)}
* already use multithreading optimization. In this case, please create an instance of this class via
* {@link #getSingleThreadInstance(ArrayContext, GeneralizedBitProcessing.SliceOperation,
* GeneralizedBitProcessing.RoundingMode) getSingleThreadInstance} method.</p>
*
* <p>This class is not thread-safe, but <b>is thread-compatible</b>
* and can be synchronized manually, if multithread access is necessary.
* However, usually there are no reasons to use the same instance of this class in different threads:
* usually there is much better idea to create a separate instance for every thread.</p>
*
* <p>AlgART Laboratory 2007–2014</p>
*
* @author Daniel Alievsky
* @version 1.2
* @since JDK 1.5
*/
public class GeneralizedBitProcessing extends AbstractArrayProcessorWithContextSwitching {
/**
* Rounding mode, in which {@link GeneralizedBitProcessing} class works: see comments to that class.
*/
public enum RoundingMode {
/**
* Round-down mode.
*/
ROUND_DOWN,
/**
* Round-up mode.
*/
ROUND_UP
}
/**
* Algorithm of processing bit arrays, that should be generalized for another element types via
* {@link GeneralizedBitProcessing} class.
*/
public static interface SliceOperation {
/**
* Processes the source bit array <tt>srcBits</tt> and saves the results in <tt>destBits</tt> bit array.
* This method is called by
* {@link GeneralizedBitProcessing#process(UpdatablePArray, PArray, Range, long)} method
* for every {@link GeneralizedBitProcessing bit slice}</a> of the source non-bit array.
* It is the main method, which you should implement to generalize some bit algorithm for non-bit case.
*
* <p>The <tt>destBits</tt> and <tt>srcBits</tt> arrays will be different bit arrays, allocated by
* {@link GeneralizedBitProcessing#process(UpdatablePArray, PArray, Range, long) process} method
* (according to the memory model, recommended by the {@link GeneralizedBitProcessing#context() context}),
* if the {@link #isInPlaceProcessingAllowed()} method returns <tt>false</tt>.
* If that method returns <tt>true</tt>, the <tt>destBits</tt> and <tt>srcBits</tt> arguments
* will be references to the same bit array.
*
* <p>The index <i>i</i> of the slice, processed by this method, is passed via <tt>sliceIndex</tt> argument.
* In other words, the threshold, corresponding to this slice, is
*
* <blockquote>
* <i>t</i><sub><i>i</i></sub> =
* <i>a<sub>min</sub></i> + <i>i</i> * (<i>a<sub>max</sub></i>−<i>a<sub>min</sub></i>)/<i>n</i>,
* <i>i</i> = <tt>sliceIndex</tt>
* </blockquote
*
* <p>(here <i>a<sub>min</sub></i>..<i>a<sub>max</sub></i> is the range of processed values and
* <i>n</i>+1 is the desired number of slices, passed via <tt>range</tt> and <tt>numberOfSlices</tt>
* arguments of
* {@link GeneralizedBitProcessing#process(UpdatablePArray, PArray, Range, long) process} method).
* In the round-down mode <tt>sliceIndex</tt> is always in range <tt>1..</tt><i>n</i>,
* and in the round-up mode it is always in range <tt>0..</tt><i>n</i><tt>-1</tt>.
* The slice with index <i>i</i>=0 (round-down mode) or <i>i</i>=<i>n</i> (round-up mode)
* is never processed, because the end result does not depend on it.
* See comments to {@link GeneralizedBitProcessing} class for more details.
*
* <p>Please note that this method can be called simultaneously in different threads,
* when {@link GeneralizedBitProcessing} class uses multithreading optimization.
* In this case, <tt>threadIndex</tt> argument will be the index of the thread,
* in which this method is called, i.e. an integer in
*
* <blockquote>
* <tt>0..min(</tt><i>n</i><tt>-1,{@link GeneralizedBitProcessing#numberOfTasks()}−1)</tt>
* </blockquote>
*
* <p>range, where <i>n</i>+1 is the desired number of slices.
* The high bound of this range, increased by 1, is also passed via <tt>numberOfThreads</tt> argument.
* The <tt>threadIndex</tt> can be very useful if your algorithm requires some work memory or other objects:
* in this case, your should allocate different work memory for different threads.
*
* <p>If multithreading optimization is not used, in particular, if the arrays, processed by
* {@link GeneralizedBitProcessing#process(UpdatablePArray, PArray, Range, long) process} method,
* are {@link BitArray bit arrays}, then <tt>threadIndex=0</tt> and <tt>numberOfThreads=1</tt>.
*
* @param context the context of execution. It will be <tt>null</tt>, if (and only if)
* the same argument of
* {@link GeneralizedBitProcessing#process(UpdatablePArray, PArray, Range, long)
* process} method is <tt>null</tt>; in this case, the context should be ignored.
* The main purpose of the context is to allow interruption of this method via
* {@link ArrayContext#checkInterruption()} and to allocate
* work memory via {@link ArrayContext#getMemoryModel()}.
* @param destBits the destination bit array, where results of processing should be stored.
* @param srcBits the source bit array for processing.
* @param sliceIndex the index of the currently processed slice,
* from <tt>1</tt> to <i>n</i> in the round-down mode or
* from <tt>0</tt> to <i>n</i><tt>-1</tt> the round-up mode
* (<i>n</i> is the desired number of slices minus 1).
* @param threadIndex the index of the current thread (different for different threads in a case of
* multithreading optimization).
* @param numberOfThreads the maximal possible value of <tt>threadIndex+1</tt>: equal to
* <nobr><tt>min(numberOfSlices−1,{@link
* GeneralizedBitProcessing#numberOfTasks()})</tt></nobr>,
* where <tt>numberOfSlices</tt>=<i>n</i>+1 is the argument of
* {@link GeneralizedBitProcessing#process(UpdatablePArray, PArray, Range, long)
* process} method.
* @throws NullPointerException if <tt>srcBits</tt> or <tt>destBits</tt> argument is <tt>null</tt>.
*/
public void processBits(ArrayContext context, UpdatableBitArray destBits, BitArray srcBits,
long sliceIndex, int threadIndex, int numberOfThreads);
/**
* Indicates whether this algorithm can work in place.
*
* <p>Some algorithms, processing bit arrays, can work in place, when the results are stored in the source
* array. In this case, this method should return <tt>true</tt>. If it returns <tt>true</tt>,
* {@link GeneralizedBitProcessing#process(UpdatablePArray, PArray, Range, long)} method
* saves memory and time by passing the same bit array as <tt>destBits</tt> and <tt>srcBits</tt> arguments
* of {@link #processBits(ArrayContext, UpdatableBitArray, BitArray, long, int, int) processBits} method.
* If it returns <tt>false</tt>,
* {@link GeneralizedBitProcessing#process(UpdatablePArray, PArray, Range, long)} method
* allocates 2 different bit arrays for source and resulting bit arrays and passes
* them as <tt>destBits</tt> and <tt>srcBits</tt> arguments
* of {@link #processBits(ArrayContext, UpdatableBitArray, BitArray, long, int, int) processBits} method.
*
* @return <tt>true</tt> if {@link #processBits(ArrayContext, UpdatableBitArray, BitArray, long, int, int)
* processBits} method can work correctly when <tt>destBits==srcBits</tt> or <tt>false</tt>
* if that method requires different source and destination arrays.
*/
public boolean isInPlaceProcessingAllowed();
}
private final ThreadPoolFactory threadPoolFactory;
private final int numberOfTasks;
private final SliceOperation sliceOperation;
private final RoundingMode roundingMode;
private final boolean inPlaceProcessingAllowed;
private GeneralizedBitProcessing(ArrayContext context,
SliceOperation sliceOperation, RoundingMode roundingMode, int numberOfTasks)
{
super(context);
if (sliceOperation == null)
throw new NullPointerException("Null sliceOperation argument");
if (roundingMode == null)
throw new NullPointerException("Null roundingMode argument");
this.threadPoolFactory = Arrays.getThreadPoolFactory(context);
this.numberOfTasks = numberOfTasks > 0 ? numberOfTasks :
Math.max(1, this.threadPoolFactory.recommendedNumberOfTasks());
this.sliceOperation = sliceOperation;
this.roundingMode = roundingMode;
this.inPlaceProcessingAllowed = sliceOperation.isInPlaceProcessingAllowed();
}
/**
* Returns new instance of this class.
*
* @param context the {@link #context() context} that will be used by this object;
* may be <tt>null</tt>, then it will be ignored, and all temporary arrays
* will be created by {@link SimpleMemoryModel}.
* @param sliceOperation the bit processing operation that will be generalized by this instance.
* @param roundingMode the rounding mode, used by the created instance.
* @return new instance of this class.
* @throws NullPointerException if <tt>sliceOperation</tt> or <tt>roundingMode</tt> argument is <tt>null</tt>.
* @see #getSingleThreadInstance(ArrayContext, SliceOperation, net.algart.arrays.GeneralizedBitProcessing.RoundingMode)
*/
public static GeneralizedBitProcessing getInstance(ArrayContext context,
SliceOperation sliceOperation, RoundingMode roundingMode)
{
return new GeneralizedBitProcessing(context, sliceOperation, roundingMode, 0);
}
/**
* Returns new instance of this class, that does not use multithreading optimization.
* Usually this class performs calculations in many threads
* (different slides in different threads), according to information from the passed context,
* but an instance, created by this method, does not perform this.
* It is the best choice if the operation, implemented by passed <tt>sliceOperation</tt> object,
* already uses multithreading.
*
* @param context the {@link #context() context} that will be used by this object;
* may be <tt>null</tt>, then it will be ignored, and all temporary arrays
* will be created by {@link SimpleMemoryModel}.
* @param sliceOperation the bit processing operation that will be generalized by this instance.
* @param roundingMode the rounding mode, used by the created instance.
* @return new instance of this class.
* @throws NullPointerException if <tt>sliceOperation</tt> or <tt>roundingMode</tt> argument is <tt>null</tt>.
* @see #getInstance(ArrayContext, SliceOperation, net.algart.arrays.GeneralizedBitProcessing.RoundingMode)
*/
public static GeneralizedBitProcessing getSingleThreadInstance(ArrayContext context,
SliceOperation sliceOperation, RoundingMode roundingMode)
{
return new GeneralizedBitProcessing(context, sliceOperation, roundingMode, 1);
}
/**
* Switches the context: returns an instance, identical to this one excepting
* that it uses the specified <tt>newContext</tt> for all operations.
* The returned instance is a clone of this one, but there is no guarantees
* that it is a deep clone.
* Usually, the returned instance is used only for performing a
* {@link ArrayContext#part(double, double) subtask} of the full task.
*
* @param newContext another context, used by the returned instance; may be <tt>null</tt>.
* @return new instance with another context.
*/
@Override
public GeneralizedBitProcessing context(ArrayContext newContext) {
return newContext == this.context() ? this :
new GeneralizedBitProcessing(newContext, this.sliceOperation, this.roundingMode, this.numberOfTasks);
}
/**
* Returns the bit processing algorithm, used by this instance.
* More precisely, this method just returns a reference to the <tt>sliceOperation</tt> object,
* passed to {@link
* #getInstance(ArrayContext, GeneralizedBitProcessing.SliceOperation, GeneralizedBitProcessing.RoundingMode)
* getInstance} or {@link #getSingleThreadInstance(ArrayContext, GeneralizedBitProcessing.SliceOperation,
* GeneralizedBitProcessing.RoundingMode) getSingleThreadInstance} method.
*
* @return the bit processing algorithm, used by this instance.
*/
public SliceOperation sliceOperation() {
return this.sliceOperation;
}
/**
* Returns the rounding mode, used by this instance.
* More precisely, this method just returns a reference to the <tt>roundingMode</tt> object,
* passed to {@link
* #getInstance(ArrayContext, GeneralizedBitProcessing.SliceOperation, GeneralizedBitProcessing.RoundingMode)
* getInstance} or {@link #getSingleThreadInstance(ArrayContext, GeneralizedBitProcessing.SliceOperation,
* GeneralizedBitProcessing.RoundingMode) getSingleThreadInstance} method.
*
* @return the roundingMode, used by this instance.
*/
public RoundingMode roundingMode() {
return this.roundingMode;
}
/**
* Returns the number of threads, that this class uses for multithreading optimization.
* It is equal to:
* <ul>
* <li>1 if this instance was created by
* {@link #getSingleThreadInstance(ArrayContext, GeneralizedBitProcessing.SliceOperation,
* GeneralizedBitProcessing.RoundingMode) getSingleThreadInstance} method or</li>
* <li><tt>context.{@link ArrayContext#getThreadPoolFactory()
* getThreadPoolFactory()}.{@link ThreadPoolFactory#recommendedNumberOfTasks() recommendedNumberOfTasks()}</tt>
* if it was created by {@link #getInstance(ArrayContext, GeneralizedBitProcessing.SliceOperation,
* GeneralizedBitProcessing.RoundingMode) getInstance} method.</li>
* </ul>
*
* <p>(As in {@link Arrays#copy(ArrayContext, UpdatableArray, Array)}, if the <tt>context</tt> argument of
* <nobr>{@link
* #getInstance(ArrayContext, GeneralizedBitProcessing.SliceOperation, GeneralizedBitProcessing.RoundingMode)
* getInstance}</nobr> method
* is <tt>null</tt>, then {@link DefaultThreadPoolFactory} is used.)
*
* <p>Note that the real number of parallel threads will be a minimum from this value and
* <i>n</i>, where <i>n</i>+1 is the desired number of slices (the last argument of
* {@link #process(UpdatablePArray, PArray, Range, long)} method).
*
* @return the number of threads, that this class uses for multithreading optimization.
*/
public int numberOfTasks() {
return this.numberOfTasks;
}
/**
* Performs processing of the source array <tt>src</tt>, with saving results in <tt>dest</tt> array,
* on the base of the bit processing algorithm, specified for this instance.
* Namely, the source array <tt>src</tt> is splitted to bit slices, each slice is processed by
* {@link
* GeneralizedBitProcessing.SliceOperation#processBits(ArrayContext, UpdatableBitArray, BitArray, long, int, int)}
* method of the {@link #sliceOperation()} object (representing the used bit processing algorithm),
* and the processed slices are joined to the resulting array <tt>dest</tt>.
* See the precise description of this generalization of a bit processing algorithm
* in the {@link GeneralizedBitProcessing comments to this class}.
*
* <p>The <tt>src</tt> and <tt>dest</tt> arrays must not be the same array or views of the same array;
* in other case, the results will be incorrect.
* These arrays must have the same element types and the same lengths; in other case,
* an exception will occur.
*
* <p>The <tt>range</tt> argument specifies the <i>a<sub>min</sub></i>..<i>a<sub>max</sub></i> range,
* used for splitting to bit slices.
* If you do not want to lose too little or too big values in the processed array, this range should
* contain all possible values of <tt>src</tt> array.
* The simplest way to provide this is using the result of {@link Arrays#rangeOf(PArray) Arrays.rangeOf(src)}.
*
* <p>The <tt>numberOfSlices</tt> argument is the number of bit slices, used for splitting the <tt>src</tt> array,
* which is equal to <i>n</i>+1 in the {@link GeneralizedBitProcessing comments to this class}.
* Less values of this argument increases speed, larger values increases precision of the result
* (only <tt>numberOfSlices</tt> different values are possible for <tt>dest</tt> elements).
* If the element type is a fixed-point type (<tt>src</tt> and <tt>dest</tt> are {@link PFixedArray} instance),
* this argument is automatically truncated to
* <nobr><tt>min(numberOfSlices, (long)(range.{@link Range#size() size()}+1.0))</tt></nobr>,
* because larger values cannot increase the precision.
*
* <p>Please remember that <tt>numberOfSlices=1</tt> (<i>n</i>=0) is a degenerated case: in this case,
* the <tt>dest</tt> array is just filled by <tt>range.min()</tt>
* (round-down mode) or <tt>range.min()</tt> (round-up mode),
* alike in {@link UpdatablePArray#fill(double)} method.
*
* <p>If the element type is <tt>boolean</tt> ({@link BitArray}), then a generalization of the bitwise algorithm
* is not necessary. In this case, if <tt>numberOfSlices>1</tt>, this method just calls
* <tt>{@link #sliceOperation()}.{@link
* GeneralizedBitProcessing.SliceOperation#processBits(ArrayContext, UpdatableBitArray, BitArray, long, int, int)
* processBits}</tt> method for the passed <tt>dest</tt> and <tt>src</tt> arrays.
* (However, according to the specification of {@link GeneralizedBitProcessing.SliceOperation}, if its
* {@link GeneralizedBitProcessing.SliceOperation#isInPlaceProcessingAllowed() isInPlaceProcessingAllowed()}
* method returns <tt>true</tt>, then <tt>src</tt> arrays is firstly copied into <tt>dest</tt>,
* and then the <tt>dest</tt> arrays is passed as both <tt>srcBits</tt> and <tt>destBits</tt> arguments.)
*
* <p>Note: if the element type is <tt>boolean</tt>, then multithreading is never used:
* {@link
* GeneralizedBitProcessing.SliceOperation#processBits(ArrayContext, UpdatableBitArray, BitArray, long, int, int)
* processBits} method is called
* in the current thread, and its <tt>threadIndex</tt> and <tt>numberOfThreads</tt> arguments are
* <tt>0</tt> and <tt>1</tt> correspondingly.
*
* @param dest the result of processing.
* @param src the source AlgART array.
* @param range the <i>a<sub>min</sub></i>..<i>a<sub>max</sub></i> range,
* used for splitting to bit slices.
* @param numberOfSlices the number of bit slices (i.e. <i>n</i>+1); must be positive.
* @throws NullPointerException if one of the arguments is <tt>null</tt>.
* @throws IllegalArgumentException if <tt>dest</tt> and <tt>src</tt> arrays have different lengths
* or element types, or if <tt>numberOfSlices<=0</tt>.
*/
public void process(UpdatablePArray dest, PArray src, Range range, long numberOfSlices) {
if (dest == null)
throw new NullPointerException("Null dest argument");
if (src == null)
throw new NullPointerException("Null src argument");
if (range == null)
throw new NullPointerException("Null range argument");
if (src.elementType() != dest.elementType())
throw new IllegalArgumentException("Different element types of src and dest arrays ("
+ src.elementType().getCanonicalName() + " and " + dest.elementType().getCanonicalName() + ")");
if (dest.length() != src.length())
throw new SizeMismatchException("Different lengths of src and dest arrays ("
+ src.length() + " and " + dest.length() + ")");
if (numberOfSlices <= 0)
throw new IllegalArgumentException("numberOfSlices must be positive");
if (src instanceof PFixedArray) {
numberOfSlices = Math.min(numberOfSlices, (long)(range.size() + 1.0));
}
if (numberOfSlices >= 2 && src instanceof BitArray) {
if (inPlaceProcessingAllowed) {
Arrays.copy(contextPart(0.0, 0.05), dest, src);
this.sliceOperation.processBits(contextPart(0.05, 1.0),
(UpdatableBitArray)dest, (BitArray)dest, 1, 0, 1);
} else {
this.sliceOperation.processBits(context(), (UpdatableBitArray)dest, (BitArray)src, 1, 0, 1);
}
return;
}
long numberOfRanges = numberOfSlices - 1;
// numberOfSlices thresholds split the range into numberOfRanges ranges:
// range.min(), range.min() + range.size() / numberOfRanges, ..., range.max()
ArrayContext acFill = numberOfRanges == 0 ? context() : contextPart(0.0, 0.1 / numberOfRanges);
ConstantFunc filler;
switch (roundingMode) {
case ROUND_DOWN:
filler = ConstantFunc.getInstance(range.min());
break;
case ROUND_UP:
filler = ConstantFunc.getInstance(range.max());
break;
default:
throw new AssertionError();
}
Arrays.copy(acFill, dest, Arrays.asIndexFuncArray(filler, dest.type(), dest.length()));
// equivalent to dest.fill(range.min()), but supports context (important for a case of very large disk array)
if (numberOfRanges == 0) { // degenerated case: numberOfSlices == 1
return;
}
assert numberOfSlices >= 2;
Runnable[] tasks = createTasks(contextPart(0.1 / numberOfRanges, 1.0), dest, src, range, numberOfRanges);
threadPoolFactory.performTasks(tasks);
}
private Runnable[] createTasks(ArrayContext context,
final UpdatablePArray dest, final PArray src, final Range range, final long numberOfRanges)
{
long len = src.length();
assert dest.length() == len;
assert dest.elementType() == src.elementType();
assert numberOfRanges >= 1 : "1 slice must be processed by trivial filling out of this method";
int nt = (int)Math.min(this.numberOfTasks, numberOfRanges);
UpdatableBitArray[] srcBits = allocateMemory(nt, src);
UpdatableBitArray[] destBits = this.inPlaceProcessingAllowed ? srcBits : allocateMemory(nt, src);
Runnable[] tasks = new Runnable[nt];
AtomicLong readyLayers = new AtomicLong(0);
AtomicBoolean interruptionRequest = new AtomicBoolean(false);
if (context != null && nt > 1) {
context = context.singleThreadVersion();
}
Lock lock = new ReentrantLock();
Condition condition = lock.newCondition();
for (int threadIndex = 0; threadIndex < tasks.length; threadIndex++) {
SliceProcessingTask task = new SliceProcessingTask(context, threadIndex, nt, lock, condition);
task.setProcessedData(dest, src, destBits, srcBits);
task.setRanges(range, numberOfRanges);
task.setSynchronizationVariables(readyLayers, interruptionRequest);
tasks[threadIndex] = task;
}
return tasks;
}
private UpdatableBitArray[] allocateMemory(int numberOfTasks, PArray src) {
long nba = this.sliceOperation.isInPlaceProcessingAllowed() ?
numberOfTasks :
2 * (long)numberOfTasks;
long arrayLength = src.length();
MemoryModel mm = Arrays.sizeOf(boolean.class, arrayLength) < Arrays.SystemSettings.maxTempJavaMemory() / nba ?
SimpleMemoryModel.getInstance() :
memoryModel();
UpdatableBitArray[] result = new UpdatableBitArray[numberOfTasks];
Matrix<? extends PArray> srcBaseMatrix = !Arrays.isTiled(src) ? null :
((ArraysTileMatrixImpl.TileMatrixArray)src).baseMatrix().cast(PArray.class);
for (int k = 0; k < result.length; k++) {
result[k] = mm.newUnresizableBitArray(arrayLength);
if (srcBaseMatrix != null) {
result[k] = srcBaseMatrix.matrix(result[k]).tile(Arrays.tileDimensions(src)).array();
}
}
return result;
}
private class SliceProcessingTask implements Runnable {
private final ArrayContext context;
private final int threadIndex;
private final int numberOfThreads;
private final Lock lock;
private final Condition condition;
private Range range;
private long numberOfRanges;
private AtomicLong readyLayers;
private AtomicBoolean interruptionRequest;
private UpdatablePArray dest;
private PArray src;
private UpdatableBitArray[] srcBits;
private UpdatableBitArray[] destBits;
private SliceProcessingTask(ArrayContext context,
int threadIndex, int numberOfThreads, Lock lock, Condition condition)
{
this.context = context;
this.threadIndex = threadIndex;
this.numberOfThreads = numberOfThreads;
this.lock = lock;
this.condition = condition;
}
public void setRanges(Range range, long numberOfRanges) {
this.range = range;
this.numberOfRanges = numberOfRanges;
}
private void setProcessedData(UpdatablePArray dest, PArray src,
UpdatableBitArray[] destBits, UpdatableBitArray[] srcBits)
{
this.dest = dest;
this.src = src;
this.destBits = destBits;
this.srcBits = srcBits;
}
private void setSynchronizationVariables(AtomicLong readyLayers, AtomicBoolean interruptionRequest) {
this.readyLayers = readyLayers;
this.interruptionRequest = interruptionRequest;
}
public strictfp void run() {
if (lock == null || range == null || dest == null || src == null)
throw new AssertionError(this + " is not initialized correctly");
for (long sliceIndex = threadIndex + 1; sliceIndex <= numberOfRanges; sliceIndex += numberOfThreads) {
// threadIndex+1 because processing the layer #0 is trivial and should be skipped
double value;
switch (roundingMode) {
case ROUND_DOWN:
value = sliceIndex == numberOfRanges ? range.max() : // to be on the safe side
range.min() + range.size() * (double)sliceIndex / numberOfRanges;
break;
case ROUND_UP:
value = sliceIndex == numberOfRanges ? range.min() : // to be on the safe side
range.max() - range.size() * (double)sliceIndex / numberOfRanges;
break;
default:
throw new AssertionError();
}
ArrayContext acToBit, acOp, acFromBit;
if (context != null) {
ArrayContext ac = context.part(sliceIndex - 1, sliceIndex, numberOfRanges);
acToBit = numberOfThreads == 1 ? ac.part(0.0, 0.05) : context.noProgressVersion();
acOp = numberOfThreads == 1 ? ac.part(0.05, 0.95) : context.noProgressVersion();
acFromBit = ac.part(0.95, 1.0);
} else {
acToBit = acOp = acFromBit = ArrayContext.DEFAULT_SINGLE_THREAD;
// if context==null, then numberOfThreads is the default system recommended number of threads;
// if it is >1, then we need provide single-thread processing in every slice,
// and if it is 1, then "null" context works like DEFAULT_SINGLE_THREAD
}
if (interruptionRequest.get()) {
return;
}
// long t1 = System.nanoTime();
switch (roundingMode) {
case ROUND_DOWN:
Arrays.packBitsGreaterOrEqual(acToBit, srcBits[threadIndex], src, value);
break;
case ROUND_UP:
Arrays.packBitsGreater(acToBit, srcBits[threadIndex], src, value);
break;
}
// Below is a slower code (commented out), equivalent to this "packBitsGreaterOrEqual"
// (but not to "packBitsLessOrEqual"):
//
// Arrays.applyFunc(acToBit,
// net.algart.math.functions.RectangularFunc.getInstance(value,
// Double.POSITIVE_INFINITY, 1.0, 0.0), srcBits[threadIndex], src);
// long t2 = System.nanoTime();
sliceOperation.processBits(acOp, destBits[threadIndex], srcBits[threadIndex],
roundingMode == RoundingMode.ROUND_DOWN ?
sliceIndex :
numberOfRanges - sliceIndex,
threadIndex, numberOfThreads);
// long t3 = System.nanoTime();
lock.lock();
try {
if (interruptionRequest.get()) {
return; // important to avoid writing anywhere if interruption is necessary
}
while (readyLayers.get() < sliceIndex - 1) {
// This loop cannot be infinite.
// Proof.
// Note that sliceIndex takes all values m=1,2,...,numberOfRanges in some order.
// Let's prove by induction with m, that this loop is not infinite when sliceIndex=m.
// 1. If sliceIndex=m=1, it is obvious: readyLayers.get() is never < 0.
// 2. Let's suppose, that this loop was successfully performed and finished in some thread,
// when sliceIndex was equal to m-1. And let now sliceIndex be equal to m.
// Because this loop was successfully performed with sliceIndex = m-1,
// so, readyLayers.get() had become >= m-2. Some time after this,
// readyLayers.incrementAndGet() operator below was executed,
// and readyLayers.get() must become >= m-1.
// At this moment, this loop must finish with sliceIndex = m.
try {
condition.await(100, TimeUnit.MILLISECONDS);
// 100 ms - to be on the safe side, if "signalAll" has not help
} catch (InterruptedException e) {
interruptionRequest.set(true);
return;
}
if (context != null) {
context.checkInterruption();
}
// to be on the safe side, allow interruption even in a case of a bug here
}
switch (roundingMode) {
case ROUND_DOWN:
Arrays.unpackUnitBits(acFromBit, dest, destBits[threadIndex], value);
break;
case ROUND_UP:
Arrays.unpackZeroBits(acFromBit, dest, destBits[threadIndex], value);
break;
}
// Below is a slower code (commented out), equivalent to this "unpackUnitBits"
// (but not to "unpackZeroBits"), which, however,
// does not need waiting for readyLayers.get() == sliceIndex-1:
//
// PArray castBits = Arrays.asFuncArray(
// net.algart.math.functions.LinearFunc.getInstance(0.0, value),
// dest.type(), destBits[threadIndex]);
// Arrays.applyFunc(acFromBit, net.algart.math.functions.Func.MAX,
// dest, dest, castBits);
long ready = readyLayers.incrementAndGet();
if (ready != sliceIndex)
throw new AssertionError("Invalid synchronization in " + GeneralizedBitProcessing.class);
condition.signalAll();
} finally {
lock.unlock();
}
// System.out.printf("%d: %.5f + %.5f + %.5f, %d tasks (%s <-- %s)%n",
// sliceIndex, (t2 - t1) * 1e-6, (t3 - t2) * 1e-6, (System.nanoTime() - t3) * 1e-6,
// numberOfThreads, destBits[threadIndex], srcBits[threadIndex]);
}
}
}
}