------------------------------------------------------------------------
The list of most significant changes made over time in
Intel(R) Threading Building Blocks (Intel(R) TBB).

Intel TBB 2017
TBB_INTERFACE_VERSION == 9100
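
Each release in this list pairs a product version with a TBB_INTERFACE_VERSION value, so client code can gate features on that number. The sketch below is only an illustration and not part of the library: the helper names are hypothetical, the constants are copied from the release headings in this list, and the fallback definition of the macro exists solely so that the snippet compiles without the TBB headers.

```cpp
// Stand-in for builds without the TBB headers (assumption for this sketch);
// a real build gets this macro from the library.
#ifndef TBB_INTERFACE_VERSION
#define TBB_INTERFACE_VERSION 9100
#endif

// Interface versions copied from the release headings in this change list.
constexpr int kInterfaceTbb42   = 7000;
constexpr int kInterfaceTbb43   = 8000;
constexpr int kInterfaceTbb44   = 9000;
constexpr int kInterfaceTbb2017 = 9100;

// Hypothetical helper: true when the available interface is at least `required`.
constexpr bool interface_at_least(int available, int required) {
    return available >= required;
}

// Hypothetical helper: the thousands part, which in this list advances with
// each release line (7xxx for 4.2, 8xxx for 4.3, 9xxx for 4.4 and 2017).
constexpr int interface_generation(int version) {
    return version / 1000;
}

// static_partitioner is listed as fully supported starting with Intel TBB 2017.
constexpr bool kStaticPartitionerFullySupported =
    interface_at_least(TBB_INTERFACE_VERSION, kInterfaceTbb2017);
```

In real code the same comparison is usually written directly in the preprocessor, for example `#if TBB_INTERFACE_VERSION >= 9100`.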

Changes (w.r.t. Intel TBB 4.4 Update 5):

- static_partitioner class is now a fully supported feature.
- async_node class is now a fully supported feature.
- Improved dynamic memory allocation replacement on Windows* OS to skip
    DLLs for which replacement cannot be done, instead of aborting.
- Intel TBB no longer performs dynamic memory allocation replacement
    for Microsoft* Visual Studio* 2008.
- For 64-bit platforms, quadrupled the worst-case limit on the amount
    of memory the Intel TBB allocator can handle.
- Added TBB_USE_GLIBCXX_VERSION macro to specify the version of GNU
    libstdc++ when it cannot be properly recognized, e.g. when used
    with Clang on Linux* OS. Inspired by a contribution from David A.
- Added graph/stereo example to demonstrate tbb::flow::async_msg.
- Removed a few cases of excessive user data copying in the flow graph.
- Reworked split_node to eliminate unnecessary overheads.
- Added support for C++11 move semantics to the argument of
    tbb::parallel_do_feeder::add() method.
- Added C++11 move constructor and assignment operator to
    tbb::combinable template class.
- Added tbb::this_task_arena::max_concurrency() function and
    max_concurrency() method of class task_arena returning the maximal
    number of threads that can work inside an arena.
- Deprecated tbb::task_arena::current_thread_index() static method;
    use tbb::this_task_arena::current_thread_index() function instead.
- All examples for commercial version of library moved online:
    https://software.intel.com/en-us/product-code-samples. Examples are
    available as a standalone package or as a part of Intel(R) Parallel
    Studio XE or Intel(R) System Studio Online Samples packages.

Changes affecting backward compatibility:

- Renamed the following methods and types in async_node class:
    Old                   New
    async_gateway_type => gateway_type
    async_gateway()    => gateway()
    async_try_put()    => try_put()
    async_reserve()    => reserve_wait()
    async_commit()     => release_wait()
- Internal layout of some flow graph nodes has changed; recompilation
    is recommended for all binaries that use the flow graph.

Preview Features:

- Added template class streaming_node to the flow graph API. It allows
    a flow graph to offload computations to other devices through
    streaming or offloading APIs.
- Template class opencl_node reimplemented as a specialization of
    streaming_node that works with OpenCL*.
- Added tbb::this_task_arena::isolate() function to isolate execution
    of a group of tasks or an algorithm from other tasks submitted
    to the scheduler.

Bugs fixed:

- Added a workaround for GCC bug #62258 in std::rethrow_exception()
    to prevent possible problems in case of exception propagation.
- Fixed parallel_scan to provide correct result if the initial value
    of an accumulator is not the operation identity value.
- Fixed a memory corruption in the memory allocator when it meets
    internal limits.
- Fixed the memory allocator on 64-bit platforms to align memory
    to 16 bytes by default for all allocations bigger than 8 bytes.
- As a workaround for crashes in the Intel TBB library compiled with
    GCC 6, added -flifetime-dse=1 to compilation options on Linux* OS.
- Fixed a race in the flow graph implementation.
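
The parallel_scan item above concerns an accumulator whose initial value is not the identity of the operation. The sketch below is a serial model written with the standard library only, not the TBB implementation; the function names are hypothetical. It shows why the case is delicate in a chunked scan: per-chunk totals must start from the identity (0 for addition), while the initial value is applied exactly once, as the carry into the first chunk.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Serial reference: inclusive prefix sum seeded with `init`.
std::vector<int> scan_reference(const std::vector<int>& in, int init) {
    std::vector<int> out(in.size());
    int acc = init;
    for (std::size_t i = 0; i < in.size(); ++i) {
        acc += in[i];
        out[i] = acc;
    }
    return out;
}

// Two-pass chunked scan, run serially here to model how a parallel scan
// splits the work. `chunk` is the number of elements per chunk.
std::vector<int> chunked_scan(const std::vector<int>& in, int init,
                              std::size_t chunk) {
    const std::size_t n = in.size();
    if (chunk == 0) chunk = 1;
    std::vector<int> out(n);

    // Pass 1: per-chunk totals start from the identity of + (0), never from init.
    std::vector<int> totals;
    for (std::size_t b = 0; b < n; b += chunk) {
        int sum = 0;
        for (std::size_t i = b; i < std::min(n, b + chunk); ++i) sum += in[i];
        totals.push_back(sum);
    }

    // Pass 2: init enters once, as the carry into the first chunk.
    int carry = init;
    std::size_t c = 0;
    for (std::size_t b = 0; b < n; b += chunk, ++c) {
        int acc = carry;
        for (std::size_t i = b; i < std::min(n, b + chunk); ++i) {
            acc += in[i];
            out[i] = acc;
        }
        carry += totals[c];
    }
    return out;
}
```

Seeding every chunk with `init` instead of 0 in pass 1 would count the initial value once per chunk, which is the kind of error a non-identity initial value can expose.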

Open-source contributions integrated:

- Enabling use of C++11 'override' keyword by Raf Schietekat.

------------------------------------------------------------------------
Intel TBB 4.4 Update 5
TBB_INTERFACE_VERSION == 9005

Changes (w.r.t. Intel TBB 4.4 Update 4):

- Modified graph/fgbzip2 example to remove unnecessary data queuing.

Preview Features:

- Added a Python* module which is able to replace Python's thread pool
    class with the implementation based on Intel TBB task scheduler.

Bugs fixed:

- Fixed the implementation of 64-bit tbb::atomic for IA-32 architecture
    to work correctly with GCC 5.2 in C++11/14 mode.
- Fixed a possible crash when tasks with affinity (e.g. specified via
    affinity_partitioner) are used simultaneously with task priority
    changes.

------------------------------------------------------------------------
Intel TBB 4.4 Update 4
TBB_INTERFACE_VERSION == 9004

Changes (w.r.t. Intel TBB 4.4 Update 3):

- Removed a few cases of excessive user data copying in the flow graph.
- Improved robustness of concurrent_bounded_queue::abort() in case of
    simultaneous push and pop operations.

Preview Features:

- Added tbb::flow::async_msg, a special message type to support
    communications between the flow graph and external asynchronous
    activities.
- async_node modified to support use with C++03 compilers.

Bugs fixed:

- Fixed a bug in dynamic memory allocation replacement for Windows* OS.
- Fixed excessive memory consumption on Linux* OS caused by enabling
    zero-copy realloc.
- Fixed performance regression on Intel(R) Xeon Phi(tm) coprocessor with
    auto_partitioner.

------------------------------------------------------------------------
Intel TBB 4.4 Update 3
TBB_INTERFACE_VERSION == 9003

Changes (w.r.t. Intel TBB 4.4 Update 2):

- Modified parallel_sort to not require a default constructor for values
    and to use iter_swap() for value swapping.
- Added support for creating or initializing a task_arena instance that
    is connected to the arena currently used by the thread.
- graph/binpack example modified to use multifunction_node.
- For performance analysis, use Intel(R) VTune(TM) Amplifier XE 2015
    and higher; older versions are no longer supported.
- Improved support for compilation with disabled RTTI, by omitting its use
    in auxiliary code, such as assertions. However, some functionality,
    particularly the flow graph, does not work if RTTI is disabled.
- The tachyon example for Android* can be built using Android Studio 1.5
    and higher with experimental Gradle plugin 0.4.0.
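
The parallel_sort item above says values no longer need a default constructor and are exchanged with iter_swap(). The sketch below illustrates that requirement with a deliberately simple serial selection sort; it is not the parallel_sort algorithm, and the names no_default and selection_sort_by_swap are hypothetical. The point is that a sort which only compares elements and swaps them through iterators never has to create a value from nothing.

```cpp
#include <algorithm>
#include <vector>

// A value type with no default constructor.
struct no_default {
    int key;
    explicit no_default(int k) : key(k) {}
};

bool operator<(const no_default& a, const no_default& b) {
    return a.key < b.key;
}

// Serial selection sort that moves elements only via std::iter_swap,
// so it never default-constructs a temporary of the value type.
template <typename ForwardIt>
void selection_sort_by_swap(ForwardIt first, ForwardIt last) {
    for (; first != last; ++first) {
        ForwardIt smallest = std::min_element(first, last);
        if (smallest != first) std::iter_swap(first, smallest);
    }
}
```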

Preview Features:

- Added class opencl_subbuffer that allows using OpenCL* sub-buffer
    objects with opencl_node.
- Class global_control supports the value of 1 for
    max_allowed_parallelism.

Bugs fixed:

- Fixed a race causing "TBB Warning: setaffinity syscall failed" message.
- Fixed a compilation issue on OS X* with Intel(R) C++ Compiler 15.0.
- Fixed a bug in queuing_rw_mutex::downgrade() that could temporarily
    block new readers.
- Fixed speculative_spin_rw_mutex to stop using the lazy subscription
    technique due to its known flaws.
- Fixed memory leaks in the tool support code.

------------------------------------------------------------------------
Intel TBB 4.4 Update 2
TBB_INTERFACE_VERSION == 9002

Changes (w.r.t. Intel TBB 4.4 Update 1):

- Improved interoperability with Intel(R) OpenMP RTL (libiomp) on Linux:
    OpenMP affinity settings do not affect the default number of threads
    used in the task scheduler. Intel(R) C++ Compiler 16.0 Update 1
    or later is required.
- Added a new flow graph example with different implementations of the
    Cholesky Factorization algorithm.

Preview Features:

- Added template class opencl_node to the flow graph API. It allows a
    flow graph to offload computations to OpenCL* devices.
- Extended join_node to use type-specified message keys. It simplifies
    the API of the node by obtaining message keys via functions
    associated with the message type (instead of node ports).
- Added static_partitioner that minimizes overhead of parallel_for and
    parallel_reduce for well-balanced workloads.
- Improved template class async_node in the flow graph API to support
    user-settable concurrency limits.

Bugs fixed:

- Fixed a possible crash in the GUI layer for library examples on Linux.

------------------------------------------------------------------------
Intel TBB 4.4 Update 1
TBB_INTERFACE_VERSION == 9001

Changes (w.r.t. Intel TBB 4.4):

- Added support for Microsoft* Visual Studio* 2015.
- Intel TBB no longer performs dynamic replacement of memory allocation
    functions for Microsoft Visual Studio 2005 and earlier versions.
- For GCC 4.7 and higher, the intrinsics-based platform isolation layer
    uses __atomic_* built-ins instead of the legacy __sync_* ones.
    This change is inspired by a contribution from Mathieu Malaterre.
- Improvements in task_arena:
    Several application threads may join a task_arena and execute tasks
    simultaneously. The amount of concurrency reserved for application
    threads at task_arena construction can be set to any value between
    0 and the arena concurrency limit.
- The fractal example was modified to demonstrate class task_arena
    and moved to examples/task_arena/fractal.

Bugs fixed:

- Fixed a deadlock during destruction of task_scheduler_init objects
    when one of the destructors is set to wait for worker threads.
- Added a workaround for a possible crash on OS X* when dynamic memory
    allocator replacement (libtbbmalloc_proxy) is used and memory is
    released during application startup.
- Usage of mutable functors with task_group::run_and_wait() and
    task_arena::enqueue() is disabled. An attempt to pass a functor
    whose operator()() is not const will produce compilation errors.
- Makefiles and environment scripts now properly recognize GCC 5.0 and
    higher.
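
The item above about mutable functors rests on a property that can be checked at compile time: whether a functor can be invoked through a const reference. The sketch below is a standard-library-only illustration of such a check, not the mechanism TBB uses; is_const_callable and run_const_only are hypothetical names.

```cpp
#include <type_traits>
#include <utility>

// Primary template: F cannot be called through a const reference.
template <typename F, typename = void>
struct is_const_callable : std::false_type {};

// Chosen only when `std::declval<const F&>()()` is a valid expression,
// that is, when F has a const-qualified operator()().
template <typename F>
struct is_const_callable<F, decltype(void(std::declval<const F&>()()))>
    : std::true_type {};

struct const_functor {
    void operator()() const {}
};

struct mutable_functor {
    int calls = 0;
    void operator()() { ++calls; }  // not const
};

// Hypothetical entry point that rejects mutable functors at compile time.
template <typename F>
void run_const_only(const F& f) {
    static_assert(is_const_callable<F>::value,
                  "functor must provide a const operator()()");
    f();
}
```

Calling `run_const_only(mutable_functor())` fails to compile, which mirrors the compilation errors the entry describes.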

Open-source contributions integrated:

- Improved performance of parallel_for_each for inputs allowing random
    access, by Raf Schietekat.

------------------------------------------------------------------------
Intel TBB 4.4
TBB_INTERFACE_VERSION == 9000

Changes (w.r.t. Intel TBB 4.3 Update 6):

- The following features are now fully supported:
    tbb::flow::composite_node;
    additional policies of tbb::flow::graph_node::reset().
- Platform abstraction layer for Windows* OS updated to use compiler
    intrinsics for most atomic operations.
- The tbb/compat/thread header updated to automatically include
    C++11 <thread> where available.
- Fixes and refactoring in the task scheduler and class task_arena.
- Added key_matching policy to tbb::flow::join_node, which removes
    the restriction on the type that can be compared-against.
- For tag_matching join_node, tag_value is redefined to be 64 bits
    wide on all architectures.
- Expanded the documentation for the flow graph with details about
    node semantics and behavior.
- Added dynamic replacement of C11 standard function aligned_alloc()
    under Linux* OS.
- Added C++11 move constructors and assignment operators to
    tbb::enumerable_thread_specific container.
- Added hashing support for tbb::tbb_thread::id.
- On OS X*, binaries that depend on libstdc++ are not provided anymore.
    In the makefiles, libc++ is now used by default; for building with
    libstdc++, specify stdlib=libstdc++ in the make command line.

Preview Features:

- Added a new example, graph/fgbzip2, that shows usage of
    tbb::flow::async_node.
- Modification to the low-level API for memory pools:
    added a function for finding a memory pool by an object allocated
    from that pool.
- tbb::memory_pool now does not request memory till the first allocation
    from the pool.

Changes affecting backward compatibility:

- Internal layout of flow graph nodes has changed; recompilation is
    recommended for all binaries that use the flow graph.
- Resetting a tbb::flow::source_node will immediately activate it,
    unless it was created in inactive state.

Bugs fixed:

- Failure at creation of a memory pool will not cause process
    termination anymore.

Open-source contributions integrated:

- Supported building TBB with Clang on AArch64 with use of built-in
    intrinsics by David A.

------------------------------------------------------------------------
Intel TBB 4.3 Update 6
TBB_INTERFACE_VERSION == 8006

Changes (w.r.t. Intel TBB 4.3 Update 5):

- Supported zero-copy realloc for objects >1MB under Linux* via
    mremap system call.
- C++11 move-aware insert and emplace methods have been added to
    concurrent_hash_map container.
- install_name is set to @rpath/<library name> on OS X*.

Preview Features:

- Added template class async_node to the flow graph API. It allows a
    flow graph to communicate with an external activity managed by
    the user or another runtime.
- Improved speed of flow::graph::reset() clearing graph edges.
    rf_extract flag has been renamed rf_clear_edges.
- extract() method of graph nodes now takes no arguments.

Bugs fixed:

- concurrent_unordered_{set,map} behaves correctly for degenerate
    hashes.
- Fixed a race condition in the memory allocator that may lead to
    excessive memory consumption under high multithreading load.

------------------------------------------------------------------------
Intel TBB 4.3 Update 5
TBB_INTERFACE_VERSION == 8005

Changes (w.r.t. Intel TBB 4.3 Update 4):

- Added add_ref_count() method of class tbb::task.

Preview Features:

- Added class global_control for application-wide control of allowed
    parallelism and thread stack size.
- memory_pool_allocator now throws the std::bad_alloc exception on
    allocation failure.
- Exceptions thrown by memory pool constructors changed from
    std::bad_alloc to std::invalid_argument and std::runtime_error.

Bugs fixed:

- scalable_allocator now throws the std::bad_alloc exception on
    allocation failure.
- Fixed a race condition in the memory allocator that may lead to
    excessive memory consumption under high multithreading load.
- A new scheduler created right after destruction of the previous one
    might be unable to modify the number of worker threads.
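
Two entries in this release make allocators report failure by throwing std::bad_alloc. The sketch below shows that contract with a minimal C++11 allocator built on std::malloc; it is an assumption-laden stand-in, not scalable_allocator or memory_pool_allocator, and malloc_allocator and oversized_request_throws are hypothetical names.

```cpp
#include <cstddef>
#include <cstdlib>
#include <limits>
#include <new>
#include <vector>

// Minimal C++11 allocator: failure is reported by throwing std::bad_alloc
// instead of returning a null pointer.
template <typename T>
struct malloc_allocator {
    typedef T value_type;

    malloc_allocator() noexcept {}
    template <typename U>
    malloc_allocator(const malloc_allocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        // Reject requests whose byte count would overflow size_t.
        if (n > std::numeric_limits<std::size_t>::max() / sizeof(T))
            throw std::bad_alloc();
        void* p = std::malloc(n == 0 ? 1 : n * sizeof(T));
        if (p == nullptr) throw std::bad_alloc();
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) noexcept { std::free(p); }
};

template <typename T, typename U>
bool operator==(const malloc_allocator<T>&, const malloc_allocator<U>&) noexcept {
    return true;
}

template <typename T, typename U>
bool operator!=(const malloc_allocator<T>&, const malloc_allocator<U>&) noexcept {
    return false;
}

// Reports whether an impossible request surfaces as std::bad_alloc.
bool oversized_request_throws() {
    malloc_allocator<int> a;
    try {
        a.allocate(std::numeric_limits<std::size_t>::max());
    } catch (const std::bad_alloc&) {
        return true;
    }
    return false;
}
```

Because the failure is an exception, standard containers instantiated with such an allocator need no null checks of their own.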

Open-source contributions integrated:

- (Added but not enabled) push_front() method of class tbb::task_list
    by Raf Schietekat.

------------------------------------------------------------------------
Intel TBB 4.3 Update 4
TBB_INTERFACE_VERSION == 8004

Changes (w.r.t. Intel TBB 4.3 Update 3):

- Added a C++11 variadic constructor for enumerable_thread_specific.
    The arguments from this constructor are used to construct
    thread-local values.
- Improved exception safety for enumerable_thread_specific.
- Added documentation for tbb::flow::tagged_msg class and
    tbb::flow::output_port function.
- Fixed build errors for systems that do not support dynamic linking.
- C++11 move-aware insert and emplace methods have been added to
    concurrent unordered containers.
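
The variadic constructor entry above describes arguments that are kept and later used to construct each thread-local value. The sketch below models only that idea with the standard library (C++14): the arguments are stored in a tuple and a fresh value is built from them on demand. It is not the enumerable_thread_specific implementation, and deferred_factory and make_deferred are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>

// Stores constructor arguments now, builds T from them later, as often as needed.
template <typename T, typename... Args>
class deferred_factory {
    std::tuple<Args...> args_;

    // Expands the stored tuple into a constructor call T(arg0, arg1, ...).
    template <std::size_t... I>
    T make_impl(std::index_sequence<I...>) const {
        return T(std::get<I>(args_)...);
    }

public:
    explicit deferred_factory(Args... args) : args_(std::move(args)...) {}

    // Each call produces an independent value built from the same arguments.
    T make() const { return make_impl(std::index_sequence_for<Args...>{}); }
};

// Deduces and decays the argument types so the factory owns copies.
template <typename T, typename... Args>
deferred_factory<T, std::decay_t<Args>...> make_deferred(Args&&... args) {
    return deferred_factory<T, std::decay_t<Args>...>(std::forward<Args>(args)...);
}
```

For example, `make_deferred<std::string>(3, 'x')` stores the pair (3, 'x') and each `make()` call yields a new string built as `std::string(3, 'x')`.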

Preview Features:

- Interface-breaking change: typedefs changed for node predecessor and
    successor lists, affecting copy_predecessors and copy_successors
    methods.
- Added template class composite_node to the flow graph API. It packages
    a subgraph to represent it as a first-class flow graph node.
- make_edge and remove_edge now accept multiport nodes as arguments,
    automatically using the node port with index 0 for an edge.

Open-source contributions integrated:

- Draft code for enumerable_thread_specific constructor with multiple
    arguments (see above) by Adrien Guinet.
- Fix for GCC invocation on IBM* Blue Gene*
    by Jeff Hammond and Raf Schietekat.
- Extended testing with smart pointers for Clang & libc++
    by Raf Schietekat.

------------------------------------------------------------------------
Intel TBB 4.3 Update 3
TBB_INTERFACE_VERSION == 8003

Changes (w.r.t. Intel TBB 4.3 Update 2):

- Move constructor and assignment operator were added to unique_lock.

Preview Features:

- Time overhead for memory pool destruction was reduced.

Open-source contributions integrated:

- Build error fix for iOS* by Raf Schietekat.

------------------------------------------------------------------------
Intel TBB 4.3 Update 2
TBB_INTERFACE_VERSION == 8002

Changes (w.r.t. Intel TBB 4.3 Update 1):

- Binary files for 64-bit Android* applications were added as part of the
    Linux* OS package.
- Exact exception propagation is enabled for Intel C++ Compiler on OS X*.
- concurrent_vector::shrink_to_fit was optimized for types that support
    C++11 move semantics.

Bugs fixed:

- Fixed concurrent unordered containers to insert elements much faster
    in debug mode.
- Fixed concurrent priority queue to support types that do not have
    copy constructors.
- Fixed enumerable_thread_specific to forbid copying from an instance
    with a different value type.

Open-source contributions integrated:

- Support for PathScale* EKOPath* Compiler by Erik Lindahl.

------------------------------------------------------------------------
Intel TBB 4.3 Update 1
TBB_INTERFACE_VERSION == 8001

Changes (w.r.t. Intel TBB 4.3):

- The ability to split blocked_ranges in a proportion, used by
    affinity_partitioner since version 4.2 Update 4, became a formal
    extension of the Range concept.
- More checks for an incorrect address to release added to the debug
    version of the memory allocator.
- Different kinds of solutions for each TBB example were merged.
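
The first item above mentions splitting a range in a proportion rather than in half. The sketch below shows the arithmetic on a plain index range; it is not blocked_range, the names index_range and split_proportionally are hypothetical, and the round-to-nearest rule is an assumption made for this illustration.

```cpp
#include <cstddef>

// Half-open index range [begin, end).
struct index_range {
    std::size_t begin;
    std::size_t end;
    std::size_t size() const { return end - begin; }
};

// Splits `r` so that the left part keeps roughly left/(left+right) of the
// elements. `r` is shrunk to the left part and the right part is returned.
// Precondition: left + right > 0.
index_range split_proportionally(index_range& r, std::size_t left,
                                 std::size_t right) {
    const std::size_t total = left + right;
    // Size of the right part, rounded to the nearest element.
    const std::size_t right_size = (r.size() * right + total / 2) / total;
    index_range tail{r.end - right_size, r.end};
    r.end = tail.begin;
    return tail;
}
```

With a 1:3 proportion, the range [0, 100) becomes [0, 25) and [25, 100); a 1:1 proportion reduces to an ordinary split in half.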

Preview Features:

- Task priorities are re-enabled in preview binaries.

Bugs fixed:

- Fixed a duplicate symbol when TBB_PREVIEW_VARIADIC_PARALLEL_INVOKE is
    used in multiple compilation units.
- Fixed a crash in __itt_fini_ittlib seen on Ubuntu 14.04.
- Fixed a crash in memory release after dynamic replacement of the
    OS X* memory allocator.
- Fixed incorrect indexing of arrays in seismic example.
- Fixed a data race in lazy initialization of task_arena.

Open-source contributions integrated:

- Fix for dumping information about gcc and clang compiler versions
    by Misty De Meo.

------------------------------------------------------------------------
Intel TBB 4.3
TBB_INTERFACE_VERSION == 8000

Changes (w.r.t. Intel TBB 4.2 Update 5):

- The following features are now fully supported: flow::indexer_node,
    task_arena, speculative_spin_rw_mutex.
- Compatibility with C++11 standard improved for tbb/compat/thread
    and tbb::mutex.
- C++11 move constructors have been added to concurrent_queue and
    concurrent_bounded_queue.
- C++11 move constructors and assignment operators have been added to
    concurrent_vector, concurrent_hash_map, concurrent_priority_queue,
    concurrent_unordered_{set,multiset,map,multimap}.
- C++11 move-aware emplace/push/pop methods have been added to
    concurrent_vector, concurrent_queue, concurrent_bounded_queue,
    concurrent_priority_queue.
- Methods to insert a C++11 initializer list have been added:
    concurrent_vector::grow_by(), concurrent_hash_map::insert(),
    concurrent_unordered_{set,multiset,map,multimap}::insert().
- Testing for compatibility of containers with some C++11 standard
    library types has been added.
- Dynamic replacement of standard memory allocation routines has been
    added for OS X*.
- Microsoft* Visual Studio* projects for Intel TBB examples updated
    to VS 2010.
- For open-source packages, debugging information (line numbers) in
    precompiled binaries now matches the source code.
- Debug information was added to release builds for OS X*, Solaris*,
    FreeBSD* operating systems and MinGW*.
- Various improvements in documentation, debug diagnostics and examples.

Preview Features:

- Additional actions on reset of graphs, and extraction of individual
    nodes from a graph (TBB_PREVIEW_FLOW_GRAPH_FEATURES).
- Support for an arbitrary number of arguments in parallel_invoke
    (TBB_PREVIEW_VARIADIC_PARALLEL_INVOKE).

Changes affecting backward compatibility:

- For compatibility with C++11 standard, copy and move constructors and
    assignment operators are disabled for all mutex classes. To allow
    the old behavior, use TBB_DEPRECATED_MUTEX_COPYING macro.
- flow::sequencer_node rejects messages with repeating sequence numbers.
- Changed internal interface between tbbmalloc and tbbmalloc_proxy.
- The following deprecated functionality has been removed:
    old debugging macros TBB_DO_ASSERT & TBB_DO_THREADING_TOOLS;
    no-op depth-related methods in class task;
    tbb::deprecated::concurrent_queue;
    deprecated variants of concurrent_vector methods.
- register_successor() and remove_successor() are deprecated as methods
    to add and remove edges in flow::graph; use make_edge() and
    remove_edge() instead.

Bugs fixed:

- Fixed incorrect scalable_msize() implementation for aligned objects.
- Flow graph buffering nodes now destroy their copy of forwarded items.
- Multiple fixes in task_arena implementation, including for:
    inconsistent task scheduler state inside executed functions;
    incorrect floating-point settings and exception propagation;
    possible stalls in concurrent invocations of execute().
- Fixed floating-point settings propagation when the same instance of
    task_group_context is used in different arenas.
- Fixed compilation error in pipeline.h with Intel Compiler on OS X*.
- Added missing headers for individual components to tbb.h.

Open-source contributions integrated:

- Range interface addition to parallel_do, parallel_for_each and
    parallel_sort by Stephan Dollberg.
- Variadic template implementation of parallel_invoke
    by Kizza George Mbidde (see Preview Features).
- Improvement in Seismic example for MacBook Pro* with Retina* display
    by Raf Schietekat.

------------------------------------------------------------------------
Intel TBB 4.2 Update 5
TBB_INTERFACE_VERSION == 7005

Changes (w.r.t. Intel TBB 4.2 Update 4):

- The second template argument of class aligned_space<T,N> now is set
    to 1 by default.

Preview Features:

- Better support for exception safety, task priorities and floating
    point settings in class task_arena.
- task_arena::current_slot() has been renamed to
    task_arena::current_thread_index().

Bugs fixed:

- Task priority change possibly ignored by a worker thread entering
    a nested parallel construct.
- Memory leaks inside the task scheduler when running on
    Intel(R) Xeon Phi(tm) coprocessor.

Open-source contributions integrated:

- Improved detection of X Window support for Intel TBB examples
    and other feedback by Raf Schietekat.

------------------------------------------------------------------------
Intel TBB 4.2 Update 4
TBB_INTERFACE_VERSION == 7004

Changes (w.r.t. Intel TBB 4.2 Update 3):

- Added possibility to specify floating-point settings at invocation
    of most parallel algorithms (including flow::graph) via
    task_group_context.
- Added dynamic replacement of malloc_usable_size() under
    Linux*/Android* and dlmalloc_usable_size() under Android*.
- Added new methods to concurrent_vector:
    grow_by() that appends a sequence between two given iterators;
    grow_to_at_least() that initializes new elements with a given value.
- Improved affinity_partitioner for better performance on balanced
    workloads.
- Improvements in the task scheduler, including better scalability
    when threads search for a task arena, and better diagnostics.
- Improved allocation performance for workloads that do intensive
    allocation/releasing of same-size objects larger than ~8KB from
    multiple threads.
- Exception support is enabled by default for 32-bit MinGW compilers.
- The tachyon example for Android* can be built for all targets
    supported by the installed NDK.
- Added Windows Store* version of the tachyon example.
- GettingStarted/sub_string_finder example ported to offload execution
    on Windows* for Intel(R) Many Integrated Core Architecture.

Preview Features:

- Removed task_scheduler_observer::on_scheduler_leaving() callback.
- Added task_scheduler_observer::may_sleep() callback.
- The CPF or_node has been renamed indexer_node. The input to
    indexer_node is now a list of types. The output of indexer_node is
    a tagged_msg type composed of a tag and a value. For indexer_node,
    the tag is a size_t.

Bugs fixed:

- Fixed data races in preview extensions of task_scheduler_observer.
- Added noexcept(false) for destructor of task_group_base to avoid
    crash on cancellation of structured task group in C++11.
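
The noexcept(false) item above follows from a C++11 language rule: destructors are implicitly noexcept, so an exception escaping one calls std::terminate unless the destructor is declared noexcept(false). The sketch below demonstrates the rule with a hypothetical throwing_guard class; it is a stand-in and does not reproduce task_group_base.

```cpp
#include <stdexcept>
#include <type_traits>

// A guard whose destructor may throw. Without noexcept(false), C++11 would
// treat the destructor as noexcept and the throw would end in std::terminate.
struct throwing_guard {
    bool armed;
    explicit throwing_guard(bool a) : armed(a) {}
    ~throwing_guard() noexcept(false) {
        if (armed) throw std::runtime_error("guard destroyed while armed");
    }
};

static_assert(!std::is_nothrow_destructible<throwing_guard>::value,
              "the destructor is declared as potentially throwing");

// Returns true when the exception thrown by the destructor reaches the caller.
bool destructor_exception_is_catchable() {
    try {
        throwing_guard g(true);
        // g is destroyed at the end of this block, outside any stack unwinding.
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

A destructor that throws while another exception is already unwinding the stack still terminates the program, so this pattern stays safe only when the destructor runs in normal control flow.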

Open-source contributions integrated:

- Improved concurrency detection for BG/Q, and other improvements
    by Raf Schietekat.
- Fix for crashes in enumerable_thread_specific when a contained
    object is too big to be constructed on the stack by Adrien Guinet.

------------------------------------------------------------------------
Intel TBB 4.2 Update 3
TBB_INTERFACE_VERSION == 7003

Changes (w.r.t. Intel TBB 4.2 Update 2):

- Added support for Microsoft* Visual Studio* 2013.
- Improved Microsoft* PPL-compatible form of parallel_for for better
    support of auto-vectorization.
- Added a new example for cancellation and reset in the flow graph:
    Kohonen self-organizing map (examples/graph/som).
- Various improvements in source code, tests, and makefiles.

Bugs fixed:

- Added dynamic replacement of _aligned_msize() previously missed.
- Fixed task_group::run_and_wait() to throw invalid_multiple_scheduling
    exception if the specified task handle is already scheduled.

Open-source contributions integrated:

- A fix for ARM* processors by Steve Capper.
- Improvements in std::swap calls by Robert Maynard.

------------------------------------------------------------------------
Intel TBB 4.2 Update 2
TBB_INTERFACE_VERSION == 7002

Changes (w.r.t. Intel TBB 4.2 Update 1):

- Enable C++11 features for Microsoft* Visual Studio* 2013 Preview.
- Added a test for compatibility of TBB containers with C++11
    range-based for loop.

Changes affecting backward compatibility:

- Internal layout changed for class tbb::flow::limiter_node.

Preview Features:

- Added speculative_spin_rw_mutex, a read-write lock class which uses
    Intel(R) Transactional Synchronization Extensions.

Bugs fixed:

- When building for Intel(R) Xeon Phi(tm) coprocessor, TBB programs
    no longer require explicit linking with librt and libpthread.

Open-source contributions integrated:

- Fixes for ARM* processors by Steve Capper, Leif Lindholm
    and Steven Noonan.
- Support for Clang on Linux by Raf Schietekat.
- Typo correction in scheduler.cpp by Julien Schueller.

------------------------------------------------------------------------
Intel TBB 4.2 Update 1
TBB_INTERFACE_VERSION == 7001

Changes (w.r.t. Intel TBB 4.2):

- Added project files for Microsoft* Visual Studio* 2010.
- Initial support of Microsoft* Visual Studio* 2013 Preview.
- Enable C++11 features available in Intel(R) C++ Compiler 14.0.
- scalable_allocation_mode(TBBMALLOC_SET_SOFT_HEAP_LIMIT, <size>) can be
    used to urge releasing memory from tbbmalloc internal buffers when
    the given limit is exceeded.

Preview Features:

- Class task_arena no longer requires linking with a preview library,
    though still remains a community preview feature.
- The method task_arena::wait_until_empty() is removed.
- The method task_arena::current_slot() now returns -1 if
    the task scheduler is not initialized in the thread.

Changes affecting backward compatibility:

- Because of changes in internal layout of graph nodes, the namespace
    interface number of flow::graph has been incremented from 6 to 7.

Bugs fixed:

- Fixed a race in lazy initialization of task_arena.
- Fixed flow::graph::reset() to prevent situations where tasks would be
    spawned in the process of resetting the graph to its initial state.
- Fixed decrement bug in limiter_node.
- Fixed a race in arc deletion in the flow graph.

Open-source contributions integrated:

- Improved support for IBM* Blue Gene* by Raf Schietekat.
|
||
|
|
||
|
------------------------------------------------------------------------
Intel TBB 4.2
TBB_INTERFACE_VERSION == 7000

Changes (w.r.t. Intel TBB 4.1 Update 4):

- Added speculative_spin_mutex, which uses Intel(R) Transactional
    Synchronization Extensions when they are supported by hardware.
- Binary files linked with libc++ (the C++ standard library in Clang)
    were added on OS X*.
- For OS X* exact exception propagation is supported with Clang;
    it requires use of libc++ and corresponding Intel TBB binaries.
- Support for C++11 initializer lists in constructor and assignment
    has been added to concurrent_hash_map, concurrent_unordered_set,
    concurrent_unordered_multiset, concurrent_unordered_map,
    concurrent_unordered_multimap.
- The memory allocator may now clean its per-thread memory caches
    when it cannot get more memory.
- Added the scalable_allocation_command() function for on-demand
    cleaning of internal memory caches.
- Reduced the time overhead for freeing memory objects smaller than ~8K.
- Simplified linking with the debug library for applications that use
    Intel TBB in code offloaded to Intel(R) Xeon Phi(tm) coprocessors.
    See an example in
    examples/GettingStarted/sub_string_finder/Makefile.
- Various improvements in source code, scripts and makefiles.

Changes affecting backward compatibility:

- tbb::flow::graph has been modified to spawn its tasks;
    the old behaviour (task enqueuing) is deprecated. This change may
    impact applications that expected a flow graph to make progress
    without calling wait_for_all(), which is no longer guaranteed. See
    the documentation for more details.
- Changed the return values of the scalable_allocation_mode() function.

Bugs fixed:

- Fixed a leak of parallel_reduce body objects when execution is
    cancelled or an exception is thrown, as suggested by Darcy Harrison.
- Fixed a race in the task scheduler which can lower the effective
    priority despite the existence of higher priority tasks.
- On Linux an error during destruction of the internal thread local
    storage no longer results in an exception.

Open-source contributions integrated:

- Fixed task_group_context state propagation to unrelated context trees
    by Raf Schietekat.

------------------------------------------------------------------------
Intel TBB 4.1 Update 4
TBB_INTERFACE_VERSION == 6105

Changes (w.r.t. Intel TBB 4.1 Update 3):

- Use /volatile:iso option with VS 2012 to disable extended
    semantics for volatile variables.
- Various improvements in affinity_partitioner, scheduler,
    tests, examples, makefiles.
- concurrent_priority_queue class now supports initialization/assignment
    via C++11 initializer list feature (std::initializer_list<T>).

Bugs fixed:

- Fixed more possible stalls in concurrent invocations of
    task_arena::execute(), especially waiting for enqueued tasks.
- Fixed requested number of workers for task_arena(P,0).
- Fixed interoperability with Intel(R) VTune(TM) Amplifier XE in
    case of using task_arena::enqueue() from a terminating thread.

Open-source contributions integrated:

- Type fixes, cleanups, and code beautification by Raf Schietekat.
- Improvements in atomic operations for big endian platforms
    by Raf Schietekat.

------------------------------------------------------------------------
Intel TBB 4.1 Update 3
TBB_INTERFACE_VERSION == 6103

Changes (w.r.t. Intel TBB 4.1 Update 2):

- Binary files for Android* applications were added to the Linux* OS
    package.
- Binary files for Windows Store* applications were added to the
    Windows* OS package.
- Exact exception propagation (exception_ptr) support on Linux OS is
    now turned on by default for GCC 4.4 and higher.
- Stopped implicit use of large memory pages by tbbmalloc (Linux-only).
    Now use of large pages must be explicitly enabled with
    scalable_allocation_mode() function or TBB_MALLOC_USE_HUGE_PAGES
    environment variable.

Community Preview Features:

- Extended class task_arena constructor and method initialize() to
    allow some concurrency to be reserved strictly for application
    threads.
- New methods terminate() and is_active() were added to class
    task_arena.

Bugs fixed:

- Fixed initialization of hashing helper constant in the hash
    containers.
- Fixed possible stalls in concurrent invocations of
    task_arena::execute() when no worker thread is available to make
    progress.
- Fixed incorrect calculation of hardware concurrency in the presence
    of inactive processor groups, particularly on systems running
    Windows* 8 and Windows* Server 2012.

Open-source contributions integrated:

- The fix for the GUI examples on OS X* systems by Raf Schietekat.
- Moved some power-of-2 calculations to functions to improve readability
    by Raf Schietekat.
- C++11/Clang support improvements by arcata.
- ARM* platform isolation layer by Steve Capper, Leif Lindholm, Leo Lara
    (ARM).

------------------------------------------------------------------------
Intel TBB 4.1 Update 2
TBB_INTERFACE_VERSION == 6102

Changes (w.r.t. Intel TBB 4.1 Update 1):

- Objects up to 128 MB are now cached by tbbmalloc. Previously
    the threshold was 8 MB. Objects larger than 128 MB are still
    processed by direct OS calls.
- concurrent_unordered_multiset and concurrent_unordered_multimap
    have been added, based on Microsoft* PPL prototype.
- Ability to value-initialize a tbb::atomic<T> variable on construction
    in C++11, with const expressions properly supported.

Community Preview Features:

- Added a possibility to wait until all worker threads terminate.
    This is necessary before calling fork() from an application.

Bugs fixed:

- Fixed data race in tbbmalloc that might lead to memory leaks
    for large object allocations.
- Fixed task_arena::enqueue() to use task_group_context of target arena.
- Improved implementation of 64-bit atomics on ia32.

------------------------------------------------------------------------
Intel TBB 4.1 Update 1
TBB_INTERFACE_VERSION == 6101

Changes (w.r.t. Intel TBB 4.1):

- concurrent_vector class now supports initialization/assignment
    via C++11 initializer list feature (std::initializer_list<T>).
- Added implementation of the platform isolation layer based on
    Intel compiler atomic built-ins; it is supposed to work on
    any platform supported by compiler version 12.1 and newer.
- Using GetNativeSystemInfo() instead of GetSystemInfo() to support
    more than 32 processors for 32-bit applications under WOW64.
- The following form of parallel_for:
    parallel_for(first, last, [step,] f[, context]) now accepts an
    optional partitioner parameter after the function f.

Backward-incompatible API changes:

- The library no longer injects tuple into namespace std.
    In previous releases, tuple was injected into namespace std by
    flow_graph.h when std::tuple was not available. In this release,
    flow_graph.h now uses tbb::flow::tuple. On platforms where
    std::tuple is available, tbb::flow::tuple is typedef'ed to
    std::tuple. On all other platforms, tbb::flow::tuple provides
    a subset of the functionality defined by std::tuple. Users of
    flow_graph.h may need to change their uses of std::tuple to
    tbb::flow::tuple to ensure compatibility with non-C++11 compliant
    compilers.

Bugs fixed:

- Fixed local observer to be able to override propagated CPU state and
    to provide correct value of task_arena::current_slot() in callbacks.

------------------------------------------------------------------------
Intel TBB 4.1
TBB_INTERFACE_VERSION == 6100

Changes (w.r.t. Intel TBB 4.0 Update 5):

- _WIN32_WINNT must be set to 0x0501 or greater in order to use TBB
    on Microsoft* Windows*.
- parallel_deterministic_reduce template function is fully supported.
- TBB headers can be used with C++0x/C++11 mode (-std=c++0x) of GCC
    and Intel(R) Compiler.
- C++11 std::make_exception_ptr is used where available, instead of
    std::copy_exception from earlier C++0x implementations.
- Improvements in the TBB allocator to reduce extra memory consumption.
- Partial refactoring of the task scheduler data structures.
- TBB examples allow more flexible specification of the thread number,
    including arithmetic and geometric progression.
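
The std::make_exception_ptr item above concerns how an exception object
is captured so that it can be rethrown later, possibly in another thread.
A minimal sketch of that capture-and-rethrow pattern in standard C++11,
independent of TBB; the helper names capture() and rethrown_message()
are hypothetical and exist only for illustration:

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical helper: wrap an exception object in an exception_ptr
// without an explicit throw/catch at the call site.
inline std::exception_ptr capture(const std::runtime_error& e) {
    return std::make_exception_ptr(e);
}

// Hypothetical helper: rethrow the stored exception and describe it,
// showing that the dynamic type and the message both survive.
inline std::string rethrown_message(std::exception_ptr p) {
    assert(p && "expected a non-null exception_ptr");
    try {
        std::rethrow_exception(p);
    } catch (const std::runtime_error& e) {
        return std::string("runtime_error: ") + e.what();
    } catch (...) {
        return "unknown";
    }
}
```

Being able to catch the original type after the rethrow is what the
entries in this file call "exact exception propagation".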

Bugs fixed:

- On Linux & OS X*, pre-built TBB binaries do not yet support exact
    exception propagation via C++11 exception_ptr. To prevent run time
    errors, by default TBB headers disable exact exception propagation
    even if the C++ implementation provides exception_ptr.

Community Preview Features:

- Added: class task_arena, for work submission by multiple application
    threads with thread-independent control of concurrency level.
- Added: task_scheduler_observer can be created as local to a master
    thread, to observe threads that work on behalf of that master.
    Local observers may have new on_scheduler_leaving() callback.

------------------------------------------------------------------------
Intel TBB 4.0 Update 5
TBB_INTERFACE_VERSION == 6005

Changes (w.r.t. Intel TBB 4.0 Update 4):

- Parallel pipeline optimization (directly storing small objects in the
    interstage data buffers) limited to trivially-copyable types for
    C++11 and a short list of types for earlier compilers.
- _VARIADIC_MAX switch is honored for TBB tuple implementation
    and flow::graph nodes based on tuple.
- Support of Cocoa framework was added to the GUI examples on OS X*
    systems.

Bugs fixed:

- Fixed a tv_nsec overflow bug in condition_variable::wait_for.
- Fixed execution order of enqueued tasks with different priorities.
- Fixed a bug with task priority changes causing lack of progress
    for fire-and-forget tasks when TBB was initialized to use 1 thread.
- Fixed duplicate symbol problem when linking multiple compilation
    units that include flow_graph.h on VC 10.
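
The tv_nsec fix listed above is of a kind that arises when a relative
timeout is added to an absolute time: the nanosecond field can exceed
999,999,999 and has to be carried into the seconds field. A minimal
sketch of that normalization, assuming a hypothetical time_point_ns
struct in place of a POSIX timespec; it is not the TBB implementation:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a POSIX timespec: seconds plus nanoseconds.
struct time_point_ns {
    std::int64_t sec;
    std::int64_t nsec;  // invariant: 0 <= nsec < 1000000000
};

// Add a non-negative relative timeout (in nanoseconds) to an absolute
// time, carrying overflow of the nanosecond field into the seconds.
inline time_point_ns add_timeout(time_point_ns now, std::int64_t timeout_ns) {
    const std::int64_t ns_per_sec = 1000000000;
    assert(timeout_ns >= 0);
    assert(now.nsec >= 0 && now.nsec < ns_per_sec);
    time_point_ns deadline;
    deadline.sec = now.sec + timeout_ns / ns_per_sec;
    deadline.nsec = now.nsec + timeout_ns % ns_per_sec;
    if (deadline.nsec >= ns_per_sec) {  // carry into seconds
        deadline.nsec -= ns_per_sec;
        deadline.sec += 1;
    }
    return deadline;
}
```

Both summands of the nanosecond field are below one second, so a single
carry step is enough to restore the invariant.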

------------------------------------------------------------------------
Intel TBB 4.0 Update 4
TBB_INTERFACE_VERSION == 6004

Changes (w.r.t. Intel TBB 4.0 Update 3):

- The TBB memory allocator transparently supports large pages on Linux.
- A new flow_graph example, logic_sim, was added.
- Support for DirectX* 9 was added to GUI examples.

Community Preview Features:

- Added: aggregator, a new concurrency control mechanism.

Bugs fixed:

- The abort operation on concurrent_bounded_queue now leaves the queue
    in a reusable state. If a bad_alloc or bad_last_alloc exception is
    thrown while the queue is recovering from an abort, that exception
    will be reported instead of user_abort on the thread on which it
    occurred, and the queue will not be reusable.
- Steal limiting heuristic fixed to avoid premature disabling of
    stealing when a large amount of __thread data is allocated on
    the thread stack.
- Fixed a low-probability leak of arenas in the task scheduler.
- In STL-compatible allocator classes, the method construct() was fixed
    to comply with C++11 requirements.
- Fixed a bug that prevented creation of fixed-size memory pools
    smaller than 2M.
- Significantly reduced the number of warnings from various compilers.

Open-source contributions integrated:

- Multiple improvements by Raf Schietekat.
- Basic support for Clang on OS X* by Blas Rodriguez Somoza.
- Fixes for warnings and corner-case bugs by Blas Rodriguez Somoza
    and Edward Lam.

------------------------------------------------------------------------
Intel TBB 4.0 Update 3
TBB_INTERFACE_VERSION == 6003

Changes (w.r.t. Intel TBB 4.0 Update 2):

- Modifications to the low-level API for memory pools:
    added support for aligned allocations;
    pool policies reworked to allow backward-compatible extensions;
    added a policy to not return memory space till destruction;
    pool_reset() does not return memory space anymore.
- Class tbb::flow::graph_iterator added to iterate over all nodes
    registered with a graph instance.
- multioutput_function_node has been renamed multifunction_node.
    multifunction_node and split_node are now fully-supported features.
- For the tagged join node, the policy for try_put of an item with
    already existing tag has been defined: the item will be rejected.
- Matching the behavior on Windows, on other platforms the optional
    shared libraries (libtbbmalloc, libirml) now are also searched
    only in the directory where libtbb is located.
- The platform isolation layer based on GCC built-ins is extended.
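
The tagged join policy in the list above (an item whose tag is already
present is rejected) can be pictured with an ordinary map. The sketch
below is only an illustration of that policy in standard C++; the class
tagged_port and its members are hypothetical and are not part of the
flow graph API:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for one input port of a tag-matching join:
// at most one buffered item per tag.
class tagged_port {
public:
    // Returns false, leaving the buffered item untouched, if an item
    // with the same tag is already waiting to be matched.
    bool try_put(int tag, const std::string& item) {
        return buffered_.emplace(tag, item).second;
    }

    // Returns the item buffered for the tag; throws std::out_of_range
    // if no item with that tag has been accepted.
    const std::string& peek(int tag) const { return buffered_.at(tag); }

private:
    std::unordered_map<int, std::string> buffered_;
};
```

Rejecting the second item, rather than overwriting the first, keeps the
item that was accepted first intact until it is matched.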

Backward-incompatible API changes:

- A graph reference parameter is now required to be passed to the
    constructors of the following flow graph nodes: overwrite_node,
    write_once_node, broadcast_node, and the CPF or_node.
- The following tbb::flow node methods and typedefs have been renamed:
       Old                             New
    join_node and or_node:
       inputs()                 ->     input_ports()
       input_ports_tuple_type   ->     input_ports_type
    multifunction_node and split_node:
       ports_type               ->     output_ports_type

Bugs fixed:

- Not all logical processors were utilized on systems with more than
    64 cores split by Windows into several processor groups.

------------------------------------------------------------------------
Intel TBB 4.0 Update 2 commercial-aligned release
TBB_INTERFACE_VERSION == 6002

Changes (w.r.t. Intel TBB 4.0 Update 1 commercial-aligned release):

- concurrent_bounded_queue now has an abort() operation that releases
    threads involved in pending push or pop operations. The released
    threads will receive a tbb::user_abort exception.
- Added Community Preview Feature: concurrent_lru_cache container,
    a concurrent implementation of LRU (least-recently-used) cache.

Bugs fixed:

- Fixed a race condition in the TBB scalable allocator.
- concurrent_queue counter wraparound bug was fixed, which occurred when
    the number of push and pop operations exceeded ~4 billion on IA32.
- Fixed races in the TBB scheduler that could put workers asleep too
    early, especially in presence of affinitized tasks.
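
The concurrent_queue entry above concerns a 32-bit counter that wraps
around after roughly 4 billion operations. A common way to stay correct
across such a wrap is to work with the unsigned difference of two
counters instead of comparing them directly. The sketch below shows that
general idiom with hypothetical function names; it is not necessarily
how the fix was made in TBB:

```cpp
#include <cstdint>

// Number of items between a pop counter (head) and a push counter (tail).
// Unsigned subtraction is performed modulo 2^32, so the result stays
// correct even after tail has wrapped past zero while head has not.
inline std::uint32_t items_in_flight(std::uint32_t head, std::uint32_t tail) {
    return tail - head;
}

// True if ticket a was issued before ticket b, assuming the two tickets
// are less than 2^31 apart. A plain (a < b) gives the wrong answer once
// b has wrapped around. The conversion to a signed type relies on the
// two's complement behaviour found on mainstream platforms (and
// guaranteed since C++20).
inline bool issued_before(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::int32_t>(a - b) < 0;
}
```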

------------------------------------------------------------------------
Intel TBB 4.0 Update 1 commercial-aligned release
TBB_INTERFACE_VERSION == 6000 (forgotten to increment)

Changes (w.r.t. Intel TBB 4.0 commercial-aligned release):

- Memory leaks fixed in binpack example.
- Improvements and fixes in the TBB allocator.
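
Because this update reports the same TBB_INTERFACE_VERSION (6000) as
Intel TBB 4.0, a version check cannot tell the two releases apart. A
minimal sketch of gating features on the interface version, with
thresholds taken from the entries in this file; the function names are
hypothetical, and real code would pass in the TBB_INTERFACE_VERSION
macro defined by the TBB headers:

```cpp
// Hypothetical feature gates. Each threshold is the
// TBB_INTERFACE_VERSION listed in this file for the release that
// introduced the feature.

// concurrent_bounded_queue::abort() arrived in Intel TBB 4.0 Update 2.
inline bool has_bounded_queue_abort(int interface_version) {
    return interface_version >= 6002;
}

// TBBMALLOC_SET_SOFT_HEAP_LIMIT arrived in Intel TBB 4.2 Update 1.
inline bool has_soft_heap_limit(int interface_version) {
    return interface_version >= 7001;
}

// Intel TBB 4.0 and 4.0 Update 1 both report 6000, so this is the
// finest distinction a version check can make between them.
inline bool is_tbb_40_or_40_update1(int interface_version) {
    return interface_version == 6000;
}
```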

------------------------------------------------------------------------
Intel TBB 4.0 commercial-aligned release
TBB_INTERFACE_VERSION == 6000

Changes (w.r.t. Intel TBB 3.0 Update 8 commercial-aligned release):

- concurrent_priority_queue is now a fully supported feature.
    Capacity control methods were removed.
- Flow graph is now a fully supported feature.
- A new memory backend has been implemented in the TBB allocator.
    It can reuse freed memory for both small and large objects, and
    returns unused memory blocks to the OS more actively.
- Improved partitioning algorithms for parallel_for and parallel_reduce
    to better handle load imbalance.
- The convex_hull example has been refactored for reproducible
    performance results.
- The major interface version has changed from 5 to 6.
    Deprecated interfaces might be removed in future releases.

Community Preview Features:

- Added: serial subset, i.e. sequential implementations of TBB generic
    algorithms (currently, only provided for parallel_for).
- Preview of new flow graph nodes:
    or_node (accepts multiple inputs, forwards each input separately
    to all successors),
    split_node (accepts tuples, and forwards each element of a tuple
    to a corresponding successor), and
    multioutput_function_node (accepts one input, and passes the input
    and a tuple of output ports to the function body to support outputs
    to multiple successors).
- Added: memory pools for more control on memory source, grouping,
    and collective deallocation.

------------------------------------------------------------------------
Intel TBB 3.0 Update 8 commercial-aligned release
TBB_INTERFACE_VERSION == 5008

Changes (w.r.t. Intel TBB 3.0 Update 7 commercial-aligned release):

- Task priorities became an official feature of TBB,
    not community preview as before.
- Atomics API extended, and implementation refactored.
- Added task::set_parent() method.
- Added concurrent_unordered_set container.

Open-source contributions integrated:

- PowerPC support by Raf Schietekat.
- Fix of potential task pool overrun and other improvements
    in the task scheduler by Raf Schietekat.
- Fix in parallel_for_each to work with std::set in Visual* C++ 2010.

Community Preview Features:

- Graph community preview feature was renamed to flow graph.
    Multiple improvements in the implementation.
    Binpack example was added for the feature.
- A number of improvements to concurrent_priority_queue.
    Shortpath example was added for the feature.
- TBB runtime loaded functionality was added (Windows*-only).
    It allows specifying which versions of TBB should be used,
    as well as setting directories for the library search.
- parallel_deterministic_reduce template function was added.

------------------------------------------------------------------------
Intel TBB 3.0 Update 7 commercial-aligned release
TBB_INTERFACE_VERSION == 5006 (forgotten to increment)

Changes (w.r.t. Intel TBB 3.0 Update 6 commercial-aligned release):

- Added implementation of the platform isolation layer based on
    GCC atomic built-ins; it is supposed to work on any platform
    where GCC has these built-ins.

Community Preview Features:

- Graph's dining_philosophers example added.
- A number of improvements to graph and concurrent_priority_queue.

------------------------------------------------------------------------
Intel TBB 3.0 Update 6 commercial-aligned release
TBB_INTERFACE_VERSION == 5006

Changes (w.r.t. Intel TBB 3.0 Update 5 commercial-aligned release):

- Added Community Preview feature: task and task group priority, and
    Fractal example demonstrating it.
- parallel_pipeline optimized for data items of small and large sizes.
- Graph's join_node is now parametrized with a tuple of up to 10 types.
- Improved performance of concurrent_priority_queue.

Open-source contributions integrated:

- Initial NetBSD support by Aleksej Saushev.

Bugs fixed:

- Failure to enable interoperability with Intel(R) Cilk(tm) Plus runtime
    library, and a crash caused by invoking the interoperability layer
    after one of the libraries was unloaded.
- Data race that could result in concurrent_unordered_map structure
    corruption after call to clear() method.
- Stack corruption caused by PIC version of 64-bit CAS compiled by Intel
    compiler on Linux.
- Inconsistency of exception propagation mode possible when application
    built with Microsoft* Visual Studio* 2008 or earlier uses TBB built
    with Microsoft* Visual Studio* 2010.
- Affinitizing master thread to a subset of available CPUs after TBB
    scheduler was initialized tied all worker threads to the same CPUs.
- Method is_stolen_task() always returned 'false' for affinitized tasks.
- write_once_node and overwrite_node did not immediately send buffered
    items to successors.

------------------------------------------------------------------------
Intel TBB 3.0 Update 5 commercial-aligned release
TBB_INTERFACE_VERSION == 5005

Changes (w.r.t. Intel TBB 3.0 Update 4 commercial-aligned release):

- Added Community Preview feature: graph.
- Added automatic propagation of master thread FPU settings to
    TBB worker threads.
- Added a public function to perform a sequentially consistent full
    memory fence: tbb::atomic_fence() in tbb/atomic.h.

Bugs fixed:

- Data race that could result in scheduler data structures corruption
    when using fire-and-forget tasks.
- Potential referencing of destroyed concurrent_hash_map element after
    using erase(accessor&A) method with A acquired as const_accessor.
- Fixed a correctness bug in the convex hull example.

Open-source contributions integrated:

- Patch for calls to internal::atomic_do_once() by Andrey Semashev.

------------------------------------------------------------------------
Intel TBB 3.0 Update 4 commercial-aligned release
TBB_INTERFACE_VERSION == 5004

Changes (w.r.t. Intel TBB 3.0 Update 3 commercial-aligned release):

- Added Community Preview feature: concurrent_priority_queue.
- Fixed library loading to avoid possibility for remote code execution,
    see http://www.microsoft.com/technet/security/advisory/2269637.mspx.
- Added support of more than 64 cores for appropriate Microsoft*
    Windows* versions. For more details, see
    http://msdn.microsoft.com/en-us/library/dd405503.aspx.
- Default number of worker threads is adjusted in accordance with
    process affinity mask.

Bugs fixed:

- Calls of scalable_* functions from inside the allocator library
    caused issues if the functions were overridden by another module.
- A crash occurred if methods run() and wait() were called concurrently
    for an empty tbb::task_group (1736).
- The tachyon example exhibited build problems associated with
    bug 554339 on Microsoft* Visual Studio* 2010. Project files were
    modified as a partial workaround to overcome the problem. See
    http://connect.microsoft.com/VisualStudio/feedback/details/554339.

------------------------------------------------------------------------
Intel TBB 3.0 Update 3 commercial-aligned release
TBB_INTERFACE_VERSION == 5003

Changes (w.r.t. Intel TBB 3.0 Update 2 commercial-aligned release):

- cache_aligned_allocator class reworked to use scalable_aligned_malloc.
- Improved performance of count() and equal_range() methods
    in concurrent_unordered_map.
- Improved implementation of 64-bit atomic loads and stores on 32-bit
    platforms, including compilation with VC 7.1.
- Added implementation of atomic operations on top of OSAtomic API
    provided by OS X*.
- Removed gratuitous try/catch blocks surrounding thread function calls
    in tbb_thread.
- Xcode* projects were added for sudoku and game_of_life examples.
- Xcode* projects were updated to work without TBB framework.

Bugs fixed:

- Fixed a data race in task scheduler destruction that on rare occasion
    could result in memory corruption.
- Fixed idle spinning in thread bound filters in tbb::pipeline (1670).

Open-source contributions integrated:

- MinGW-64 basic support by brsomoza (partially).
- Patch for atomic.h by Andrey Semashev.
- Support for AIX & GCC on PowerPC by Giannis Papadopoulos.
- Various improvements by Raf Schietekat.

------------------------------------------------------------------------
Intel TBB 3.0 Update 2 commercial-aligned release
TBB_INTERFACE_VERSION == 5002

Changes (w.r.t. Intel TBB 3.0 Update 1 commercial-aligned release):

- Destructor of tbb::task_group class throws missing_wait exception
    if there are tasks running when it is invoked.
- Interoperability layer with Intel Cilk Plus runtime library added
    to protect TBB TLS in case of nested usage with Intel Cilk Plus.
- Compilation fix for dependent template names in concurrent_queue.
- Memory allocator code refactored to ease development and maintenance.

Bugs fixed:

- Improved interoperability with other Intel software tools on Linux in
    case of dynamic replacement of memory allocator (1700).
- Fixed install issues that prevented installation on
    Mac OS* X 10.6.4 (1711).

------------------------------------------------------------------------
Intel TBB 3.0 Update 1 commercial-aligned release
TBB_INTERFACE_VERSION == 5000 (forgotten to increment)

Changes (w.r.t. Intel TBB 3.0 commercial-aligned release):

- Decreased memory fragmentation by allocations bigger than 8K.
- Lazily allocate worker threads, to avoid creating unnecessary stacks.

Bugs fixed:

- TBB allocator used much more memory than malloc (1703) - see above.
- Deadlocks happened in some specific initialization scenarios
    of the TBB allocator (1701, 1704).
- Regression in enumerable_thread_specific: excessive requirements
    for object constructors.
- A bug in construction of parallel_pipeline filters when body instance
    was a temporary object.
- Incorrect usage of memory fences on PowerPC and XBOX360 platforms.
- A subtle issue in task group context binding that could result
    in cancellation signal being missed by nested task groups.
- Incorrect construction of concurrent_unordered_map if specified
    number of buckets is not a power of two.
- Broken count() and equal_range() of concurrent_unordered_map.
- Return type of postfix form of operator++ for hash map's iterators.
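
The concurrent_unordered_map entry above involves requested bucket
counts that are not powers of two. One common remedy in hash tables is
to round the requested count up to the next power of two. The sketch
below illustrates that idiom with hypothetical function names; it is
not taken from the TBB sources and may differ from the actual fix:

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// True if n has exactly one bit set.
inline bool is_power_of_two(std::size_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// Smallest power of two that is >= n. Requires 1 <= n <= 2^(bits-1),
// so that the result is representable in std::size_t.
inline std::size_t round_up_to_power_of_two(std::size_t n) {
    const std::size_t largest =
        std::numeric_limits<std::size_t>::max() / 2 + 1;
    assert(n >= 1 && n <= largest);
    std::size_t p = 1;
    while (p < n) p <<= 1;
    return p;
}
```

With a power-of-two bucket count, a bucket index can be computed as
hash & (count - 1) instead of a modulo operation.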

------------------------------------------------------------------------
Intel TBB 3.0 commercial-aligned release
TBB_INTERFACE_VERSION == 5000

Changes (w.r.t. Intel TBB 2.2 Update 3 commercial-aligned release):

- All open-source-release changes down to TBB 2.2 U3 below
    were incorporated into this release.

------------------------------------------------------------------------
20100406 open-source release

Changes (w.r.t. 20100310 open-source release):

- Added support for Microsoft* Visual Studio* 2010, including binaries.
- Added a PDF file with recommended Design Patterns for TBB.
- Added parallel_pipeline function and companion classes and functions
    that provide a strongly typed lambda-friendly pipeline interface.
- Reworked enumerable_thread_specific to use a custom implementation of
    hash map that is more efficient for ETS usage models.
- Added example for class task_group; see examples/task_group/sudoku.
- Removed two examples, as they were long outdated and superseded:
    pipeline/text_filter (use pipeline/square);
    parallel_while/parallel_preorder (use parallel_do/parallel_preorder).
- PDF documentation updated.
- Other fixes and changes in code, tests, and examples.

Bugs fixed:

- Eliminated build errors with MinGW32.
- Fixed post-build step and other issues in VS projects for examples.
- Fixed discrepancy between scalable_realloc and scalable_msize that
    caused crashes with malloc replacement on Windows.

------------------------------------------------------------------------
20100310 open-source release

Changes (w.r.t. Intel TBB 2.2 Update 3 commercial-aligned release):

- Version macros changed in anticipation of a future release.
- Directory structure aligned with Intel(R) C++ Compiler;
    now TBB binaries reside in <arch>/<os_key>/[bin|lib]
    (in TBB 2.x, it was [bin|lib]/<arch>/<os_key>).
- Visual Studio projects changed for examples: instead of separate set
    of files for each VS version, now there is single 'msvs' directory
    that contains workspaces for MS C++ compiler (<example>_cl.sln) and
    Intel C++ compiler (<example>_icl.sln). Works with VS 2005 and above.
- The name versioning scheme for backward compatibility was improved;
    now compatibility-breaking changes are done in a separate namespace.
- Added concurrent_unordered_map implementation based on a prototype
    developed in Microsoft for a future version of PPL.
- Added PPL-compatible writer-preference RW lock (reader_writer_lock).
- Added TBB_IMPLEMENT_CPP0X macro to control injection of C++0x names
    implemented in TBB into namespace std.
- Added almost-C++0x-compatible std::condition_variable, plus a bunch
    of other C++0x classes required by condition_variable.
- With TBB_IMPLEMENT_CPP0X, tbb_thread can be also used as std::thread.
- task.cpp was split into several translation units to structure
    TBB scheduler sources layout. Static data layout and library
    initialization logic were also updated.
- TBB scheduler reworked to prevent master threads from stealing
    work belonging to other masters.
- Class task was extended with enqueue() method, and slightly changed
    semantics of methods spawn() and destroy(). For exact semantics,
    refer to TBB Reference manual.
- task_group_context now allows for destruction by non-owner threads.
- Added TBB_USE_EXCEPTIONS macro to control use of exceptions in TBB
    headers. It turns off (i.e. sets to 0) automatically if specified
    compiler options disable exception handling.
- TBB is enabled to run on top of Microsoft's Concurrency Runtime
    on Windows* 7 (via our worker dispatcher known as RML).
- Removed old unused busy-waiting code in concurrent_queue.
- Described the advanced build & test options in src/index.html.
- Warning level for GCC raised with -Wextra and a few other options.
- Multiple fixes and improvements in code, tests, examples, and docs.

Open-source contributions integrated:

- Xbox support by Roman Lut (Deep Shadows), though further changes are
    required to make it work; e.g. post-2.1 entry points are missing.
- "Eventcount" by Dmitry Vyukov evolved into concurrent_monitor,
    an internal class used in the implementation of concurrent_queue.

------------------------------------------------------------------------
|
||
|
Intel TBB 2.2 Update 3 commercial-aligned release
|
||
|
TBB_INTERFACE_VERSION == 4003
|
||
|
|
||
|
Changes (w.r.t. Intel TBB 2.2 Update 2 commercial-aligned release):
|
||
|
|
||
|
- PDF documentation updated.
|
||
|
|
||
|
Bugs fixed:
|
||
|
|
||
|
- concurrent_hash_map compatibility issue exposed on Linux in case
|
||
|
two versions of the container were used by different modules.
|
||
|
- enforce 16-byte stack alignment for consistency with GCC; required
|
||
|
to work correctly with 128-bit variables processed by SSE.
|
||
|
- construct() methods of allocator classes now use global operator new.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
Intel TBB 2.2 Update 2 commercial-aligned release
|
||
|
TBB_INTERFACE_VERSION == 4002
|
||
|
|
||
|
Changes (w.r.t. Intel TBB 2.2 Update 1 commercial-aligned release):
|
||
|
|
||
|
- parallel_invoke and parallel_for_each now take function objects
|
||
|
by const reference, not by value.
|
||
|
- Building TBB with /MT is supported, to avoid dependency on particular
|
||
|
versions of Visual C++* runtime DLLs. TBB DLLs built with /MT
|
||
|
are located in vc_mt directory.
|
||
|
- Class critical_section introduced.
|
||
|
- Improvements in exception support: new exception classes introduced,
|
||
|
all exceptions are thrown via an out-of-line internal method.
|
||
|
- Improvements and fixes in the TBB allocator and malloc replacement,
|
||
|
including robust memory identification, and more reliable dynamic
|
||
|
function substitution on Windows*.
|
||
|
- Method swap() added to class tbb_thread.
|
||
|
- Methods rehash() and bucket_count() added to concurrent_hash_map.
|
||
|
- Added support for Visual Studio* 2010 Beta2. No special binaries
|
||
|
provided, but CRT-independent DLLs (vc_mt) should work.
|
||
|
- Other fixes and improvements in code, tests, examples, and docs.
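
To illustrate the first item in this list (function objects taken by const reference rather than by value), here is a minimal standard-C++ sketch of the two calling styles. It does not use TBB itself; CountingBody, invoke_by_value and invoke_by_const_ref are hypothetical names made up for this example. It shows two practical consequences of such a change: the callee no longer copies the function object, and the object's operator() has to be const.

```cpp
#include <cassert>

// Hypothetical function object that counts how often it is copied.
struct CountingBody {
    static int copies;
    CountingBody() {}
    CountingBody(const CountingBody&) { ++copies; }
    // Must be const so it can be called through a const reference.
    void operator()() const {}
};
int CountingBody::copies = 0;

// "By value" style: the callee receives its own copy of the function object.
template <typename F>
void invoke_by_value(F f) { f(); }

// "By const reference" style: the callee borrows the caller's object.
template <typename F>
void invoke_by_const_ref(const F& f) { f(); }

// Number of copies made when an existing object is passed by value.
int copies_made_by_value() {
    CountingBody body;
    CountingBody::copies = 0;
    invoke_by_value(body);
    return CountingBody::copies;
}

// Number of copies made when an existing object is passed by const reference.
int copies_made_by_const_ref() {
    CountingBody body;
    CountingBody::copies = 0;
    invoke_by_const_ref(body);
    return CountingBody::copies;
}
```

Function objects with a non-const operator() compile with the by-value style but not with the const-reference style, so code written against the older signatures may need a const qualifier added.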
|
||
|
|
||
|
Open-source contributions integrated:
|
||
|
|
||
|
- The fix to build 32-bit TBB on Mac OS* X 10.6.
|
||
|
- GCC-based port for SPARC Solaris by Michailo Matijkiw, with use of
|
||
|
earlier work by Raf Schietekat.
|
||
|
|
||
|
Bugs fixed:
|
||
|
|
||
|
- 159 - TBB build for PowerPC* running Mac OS* X.
|
||
|
- 160 - IBM* Java segfault if used with TBB allocator.
|
||
|
- crash in concurrent_queue<char> (1616).
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
Intel TBB 2.2 Update 1 commercial-aligned release
|
||
|
TBB_INTERFACE_VERSION == 4001
|
||
|
|
||
|
Changes (w.r.t. Intel TBB 2.2 commercial-aligned release):
|
||
|
|
||
|
- Incorporates all changes from open-source releases below.
|
||
|
- Documentation was updated.
|
||
|
- TBB scheduler auto-initialization now covers all possible use cases.
|
||
|
- concurrent_queue: made argument types of sizeof used in paddings
|
||
|
consistent with those actually used.
|
||
|
- Memory allocator was improved: supported corner case of user's malloc
|
||
|
calling scalable_malloc (non-Windows), corrected processing of
|
||
|
memory allocation requests during tbb memory allocator startup
|
||
|
(Linux).
|
||
|
- Windows malloc replacement now has better support for static objects.
|
||
|
- In pipeline setups that do not allow actual parallelism, execution
|
||
|
by a single thread is guaranteed, idle spinning eliminated, and
|
||
|
performance improved.
|
||
|
- RML refactoring and clean-up.
|
||
|
- New constructor for concurrent_hash_map allows reserving space for
|
||
|
a number of items.
|
||
|
- Operator delete() added to the TBB exception classes.
|
||
|
- Lambda support was improved in parallel_reduce.
|
||
|
- gcc 4.3 warnings were fixed for concurrent_queue.
|
||
|
- Fixed possible initialization deadlock in modules using TBB entities
|
||
|
during construction of global static objects.
|
||
|
- Copy constructor in concurrent_hash_map was fixed.
|
||
|
- Fixed a couple of rare crashes in the scheduler that could occur
|
||
|
in very specific use cases.
|
||
|
- Fixed a rare crash in the TBB allocator running out of memory.
|
||
|
- New tests were implemented, including test_lambda.cpp that checks
|
||
|
support for lambda expressions.
|
||
|
- A few other small changes in code, tests, and documentation.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20090809 open-source release
|
||
|
|
||
|
Changes (w.r.t. Intel TBB 2.2 commercial-aligned release):
|
||
|
|
||
|
- Fixed known exception safety issues in concurrent_vector.
|
||
|
- Better concurrency of simultaneous grow requests in concurrent_vector.
|
||
|
- TBB allocator further improves performance of large object allocation.
|
||
|
- A problem with the source of text relocations was fixed on Linux.
|
||
|
- Fixed bugs related to malloc replacement under Windows.
|
||
|
- A few other small changes in code and documentation.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
Intel TBB 2.2 commercial-aligned release
|
||
|
TBB_INTERFACE_VERSION == 4000
|
||
|
|
||
|
Changes (w.r.t. Intel TBB 2.1 U4 commercial-aligned release):
|
||
|
|
||
|
- Incorporates all changes from open-source releases below.
|
||
|
- Architecture folders renamed from em64t to intel64 and from itanium
|
||
|
to ia64.
|
||
|
- Major Interface version changed from 3 to 4. Deprecated interfaces
|
||
|
might be removed in future releases.
|
||
|
- Parallel algorithms that use partitioners have switched to use
|
||
|
the auto_partitioner by default.
|
||
|
- Improved memory allocator performance for allocations bigger than 8K.
|
||
|
- Added new thread-bound filters functionality for pipeline.
|
||
|
- New implementation of concurrent_hash_map that improves performance
|
||
|
significantly.
|
||
|
- A few other small changes in code and documentation.
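
The TBB_INTERFACE_VERSION values quoted in these notes (4000 for this release, 3016 for 2.1 Update 4) carry the major interface version in the thousands, which is how the change "from 3 to 4" shows up in the number. The sketch below assumes the value is major * 1000 + minor. SplitVersion and split_interface_version are hypothetical helper names for this example and are not part of TBB.

```cpp
#include <cassert>

// Hypothetical helper type holding the two parts of an interface version.
struct SplitVersion {
    int major_part;  // e.g. 4 for TBB 2.2
    int minor_part;  // e.g. 3 for TBB 2.2 Update 3
};

// Assumption: interface_version == major * 1000 + minor.
inline SplitVersion split_interface_version(int interface_version) {
    SplitVersion result;
    result.major_part = interface_version / 1000;
    result.minor_part = interface_version % 1000;
    return result;
}
```

Under that assumption, 4003 splits into major 4 and minor 3, and 3016 into major 3 and minor 16.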
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20090511 open-source release
|
||
|
|
||
|
Changes (w.r.t. previous open-source release):
|
||
|
|
||
|
- Basic support for MinGW32 development kit.
|
||
|
- Added tbb::zero_allocator class that initializes memory with zeros.
|
||
|
It can be used as an adaptor to any STL-compatible allocator class.
|
||
|
- Added tbb::parallel_for_each template function as alias to parallel_do.
|
||
|
- Added more overloads for tbb::parallel_for.
|
||
|
- Added support for exact exception propagation (can only be used with
|
||
|
compilers that support C++0x std::exception_ptr).
|
||
|
- tbb::atomic template class can be used with enumerations.
|
||
|
- mutex, recursive_mutex, spin_mutex, spin_rw_mutex classes extended
|
||
|
with explicit lock/unlock methods.
|
||
|
- Fixed size() and grow_to_at_least() methods of tbb::concurrent_vector
|
||
|
to provide space allocation guarantees. More methods added for
|
||
|
compatibility with std::vector, including some from C++0x.
|
||
|
- Preview of a lambda-friendly interface for low-level use of tasks.
|
||
|
- scalable_msize function added to the scalable allocator (Windows only).
|
||
|
- Rationalized internal auxiliary functions for spin-waiting and backoff.
|
||
|
- Several tests were substantially refactored.
|
||
|
|
||
|
Changes affecting backward compatibility:
|
||
|
|
||
|
- Improvements in concurrent_queue, including limited API changes.
|
||
|
The previous version is deprecated; its functionality is accessible
|
||
|
via methods of the new tbb::concurrent_bounded_queue class.
|
||
|
- grow* and push_back methods of concurrent_vector changed to return
|
||
|
iterators; the old semantics are deprecated.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
Intel TBB 2.1 Update 4 commercial-aligned release
|
||
|
TBB_INTERFACE_VERSION == 3016
|
||
|
|
||
|
Changes (w.r.t. Intel TBB 2.1 U3 commercial-aligned release):
|
||
|
|
||
|
- Added tests for aligned memory allocations and malloc replacement.
|
||
|
- Several improvements for better bundling with Intel(R) C++ Compiler.
|
||
|
- A few other small changes in code and documentation.
|
||
|
|
||
|
Bugs fixed:
|
||
|
|
||
|
- 150 - request to build TBB examples with debug info in release mode.
|
||
|
- backward compatibility issue with concurrent_queue on Windows.
|
||
|
- dependency on VS 2005 SP1 runtime libraries removed.
|
||
|
- compilation of GUI examples under Xcode* 3.1 (1577).
|
||
|
- On Windows, TBB allocator classes can be instantiated with const types
|
||
|
for compatibility with MS implementation of STL containers (1566).
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20090313 open-source release
|
||
|
|
||
|
Changes (w.r.t. 20081109 open-source release):
|
||
|
|
||
|
- Includes all changes introduced in TBB 2.1 Update 2 & Update 3
|
||
|
commercial-aligned releases (see below for details).
|
||
|
- Added tbb::parallel_invoke template function. It runs up to 10
|
||
|
user-defined functions in parallel and waits for them to complete.
|
||
|
- Added a special library providing ability to replace the standard
|
||
|
memory allocation routines in Microsoft* C/C++ RTL (malloc/free,
|
||
|
global new/delete, etc.) with the TBB memory allocator.
|
||
|
Usage details are described in include/tbb/tbbmalloc_proxy.h file.
|
||
|
- Task scheduler switched to use new implementation of its core
|
||
|
functionality (deque based task pool, new structure of arena slots).
|
||
|
- Preview of Microsoft* Visual Studio* 2005 project files for
|
||
|
building the library is available in build/vsproject folder.
|
||
|
- Added tests for aligned memory allocations and malloc replacement.
|
||
|
- Added parallel_for/game_of_life.net example (for Windows only)
|
||
|
showing TBB usage in a .NET application.
|
||
|
- A number of other fixes and improvements to code, tests, makefiles,
|
||
|
examples and documents.
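
The behaviour described above for tbb::parallel_invoke (start several user-defined functions, return only when all have completed) can be pictured with the standard-library sketch below. This is not how TBB implements it: TBB schedules the functions as tasks on its worker threads, whereas this sketch simply uses one std::thread per function. run_all_and_wait and sum_of_two_parallel_results are hypothetical names for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: run every task on its own thread, then wait for all.
inline void run_all_and_wait(const std::vector<std::function<void()> >& tasks) {
    std::vector<std::thread> workers;
    workers.reserve(tasks.size());
    for (std::size_t i = 0; i < tasks.size(); ++i)
        workers.emplace_back(tasks[i]);  // each thread gets a copy of the task
    for (std::size_t i = 0; i < workers.size(); ++i)
        workers[i].join();               // return only after every task is done
}

// Two independent computations run concurrently; results are read after the wait.
inline int sum_of_two_parallel_results() {
    int a = 0;
    int b = 0;
    std::vector<std::function<void()> > tasks;
    tasks.push_back([&a] { a = 20; });
    tasks.push_back([&b] { b = 22; });
    run_all_and_wait(tasks);
    return a + b;
}
```

Because the helper joins every thread before returning, the caller can read the results without further synchronization, which mirrors the "waits for them to complete" guarantee in the entry above.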
|
||
|
|
||
|
Bugs fixed:
|
||
|
|
||
|
- The same list as in TBB 2.1 Update 4 right above.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
Intel TBB 2.1 Update 3 commercial-aligned release
|
||
|
TBB_INTERFACE_VERSION == 3015
|
||
|
|
||
|
Changes (w.r.t. Intel TBB 2.1 U2 commercial-aligned release):
|
||
|
|
||
|
- Added support for aligned allocations to the TBB memory allocator.
|
||
|
- Added a special library to use with LD_PRELOAD on Linux* in order to
|
||
|
replace the standard memory allocation routines in C/C++ with the
|
||
|
TBB memory allocator.
|
||
|
- Added null_mutex and null_rw_mutex: no-op classes interface-compliant
|
||
|
to other TBB mutexes.
|
||
|
- Improved performance of parallel_sort, to close most of the serial gap
|
||
|
with std::sort, and beat it on 2 or more cores.
|
||
|
- A few other small changes.
|
||
|
|
||
|
Bugs fixed:
|
||
|
|
||
|
- the problem where parallel_for hung after an exception was thrown
|
||
|
if affinity_partitioner was used (1556).
|
||
|
- get rid of VS warnings about mbstowcs deprecation (1560),
|
||
|
as well as some other warnings.
|
||
|
- operator== for concurrent_vector::iterator fixed to work correctly
|
||
|
with different vector instances.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
Intel TBB 2.1 Update 2 commercial-aligned release
|
||
|
TBB_INTERFACE_VERSION == 3014
|
||
|
|
||
|
Changes (w.r.t. Intel TBB 2.1 U1 commercial-aligned release):
|
||
|
|
||
|
- Incorporates all open-source-release changes down to TBB 2.1 U1,
|
||
|
except for:
|
||
|
- 20081019 addition of enumerable_thread_specific;
|
||
|
- Warning level for Microsoft* Visual C++* compiler raised to /W4 /Wp64;
|
||
|
warnings found on this level were cleaned or suppressed.
|
||
|
- Added TBB_runtime_interface_version API function.
|
||
|
- Added new example: pipeline/square.
|
||
|
- Added exception handling and cancellation support
|
||
|
for parallel_do and pipeline.
|
||
|
- Added copy constructor and [begin,end) constructor to concurrent_queue.
|
||
|
- Added some support for beta version of Intel(R) Parallel Amplifier.
|
||
|
- Added scripts to set environment for cross-compilation of 32-bit
|
||
|
applications on 64-bit Linux with Intel(R) C++ Compiler.
|
||
|
- Fixed semantics of concurrent_vector::clear() to not deallocate
|
||
|
internal arrays. Fixed compact() to perform such deallocation later.
|
||
|
- Fixed the issue with atomic<T*> when T is an incomplete type.
|
||
|
- Improved support for PowerPC* Macintosh*, including the fix
|
||
|
for a bug in masked compare-and-swap reported by a customer.
|
||
|
- As usual, a number of other improvements everywhere.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20081109 open-source release
|
||
|
|
||
|
Changes (w.r.t. previous open-source release):
|
||
|
|
||
|
- Added new serial out of order filter for tbb::pipeline.
|
||
|
- Fixed the issue with atomic<T*>::operator= reported at the forum.
|
||
|
- Fixed the issue with using tbb::task::self() in task destructor
|
||
|
reported at the forum.
|
||
|
- A number of other improvements to code, tests, makefiles, examples
|
||
|
and documents.
|
||
|
|
||
|
Open-source contributions integrated:
|
||
|
- Changes in the memory allocator were partially integrated.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20081019 open-source release
|
||
|
|
||
|
Changes (w.r.t. previous open-source release):
|
||
|
|
||
|
- Introduced enumerable_thread_specific<T>. This new class provides a
|
||
|
wrapper around native thread local storage as well as iterators and
|
||
|
ranges for accessing the thread local copies (1533).
|
||
|
- Improved support for Intel(R) Threading Analysis Tools
|
||
|
on Intel(R) 64 architecture.
|
||
|
- Dependency on Microsoft* CRT was integrated into the libraries using
|
||
|
manifests, to avoid issues if called from code that uses a different
|
||
|
version of Visual C++* runtime than the library.
|
||
|
- Introduced new defines TBB_USE_ASSERT, TBB_USE_DEBUG,
|
||
|
TBB_USE_PERFORMANCE_WARNINGS, TBB_USE_THREADING_TOOLS.
|
||
|
- A number of other improvements to code, tests, makefiles, examples
|
||
|
and documents.
|
||
|
|
||
|
Open-source contributions integrated:
|
||
|
|
||
|
- linker optimization: /incremental:no .
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20080925 open-source release
|
||
|
|
||
|
Changes (w.r.t. previous open-source release):
|
||
|
|
||
|
- Same fix for a memory leak in the memory allocator as in TBB 2.1 U1.
|
||
|
- Improved support for lambda functions.
|
||
|
- Fixed more concurrent_queue issues reported at the forum.
|
||
|
- A number of other improvements to code, tests, makefiles, examples
|
||
|
and documents.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
Intel TBB 2.1 Update 1 commercial-aligned release
|
||
|
TBB_INTERFACE_VERSION == 3013
|
||
|
|
||
|
Changes (w.r.t. Intel TBB 2.1 commercial-aligned release):
|
||
|
|
||
|
- Fixed small memory leak in the memory allocator.
|
||
|
- Incorporates all open-source-release changes since TBB 2.1,
|
||
|
except for:
|
||
|
- 20080825 changes for parallel_do;
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20080825 open-source release
|
||
|
|
||
|
Changes (w.r.t. previous open-source release):
|
||
|
|
||
|
- Added exception handling and cancellation support for parallel_do.
|
||
|
- Added default HashCompare template argument for concurrent_hash_map.
|
||
|
- Fixed concurrent_queue.clear() issues due to an incorrect assumption
|
||
|
about clear() being a private method.
|
||
|
- Added the possibility to use TBB in applications that change
|
||
|
default calling conventions (Windows* only).
|
||
|
- Many improvements to code, tests, examples, makefiles and documents.
|
||
|
|
||
|
Bugs fixed:
|
||
|
|
||
|
- 120, 130 - memset declaration missing in concurrent_hash_map.h.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20080724 open-source release
|
||
|
|
||
|
Changes (w.r.t. previous open-source release):
|
||
|
|
||
|
- Inline assembly for atomic operations improved for gcc 4.3.
|
||
|
- A few more improvements to the code.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20080709 open-source release
|
||
|
|
||
|
Changes (w.r.t. previous open-source release):
|
||
|
|
||
|
- operator=() was added to the tbb_thread class according to
|
||
|
the current working draft for std::thread.
|
||
|
- Recognizing SPARC* in makefiles for Linux* and Sun Solaris*.
|
||
|
|
||
|
Bugs fixed:
|
||
|
|
||
|
- 127 - concurrent_hash_map::range fixed to split correctly.
|
||
|
|
||
|
Open-source contributions integrated:
|
||
|
|
||
|
- fix_set_midpoint.diff by jyasskin
|
||
|
- SPARC* support in makefiles by Raf Schietekat
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20080622 open-source release
|
||
|
|
||
|
Changes (w.r.t. previous open-source release):
|
||
|
|
||
|
- Fixed a hang that rarely happened on Linux
|
||
|
during deinitialization of the TBB scheduler.
|
||
|
- Improved support for Intel(R) Thread Checker.
|
||
|
- A few more improvements to the code.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
Intel TBB 2.1 commercial-aligned release
|
||
|
TBB_INTERFACE_VERSION == 3011
|
||
|
|
||
|
Changes (w.r.t. Intel TBB 2.0 U3 commercial-aligned release):
|
||
|
|
||
|
- All open-source-release changes down to, and including, TBB 2.0 below,
|
||
|
were incorporated into this release.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20080605 open-source release
|
||
|
|
||
|
Changes (w.r.t. previous open-source release):
|
||
|
|
||
|
- Explicit control of exported symbols by version scripts added on Linux.
|
||
|
- Interfaces polished for exception handling & algorithm cancellation.
|
||
|
- Cache behavior improvements in the scalable allocator.
|
||
|
- Improvements in text_filter, polygon_overlay, and other examples.
|
||
|
- A lot of other stability improvements in code, tests, and makefiles.
|
||
|
- First release where binary packages include headers/docs/examples, so
|
||
|
binary packages are now self-sufficient for using TBB.
|
||
|
|
||
|
Open-source contributions integrated:
|
||
|
|
||
|
- atomics patch (partially).
|
||
|
- tick_count warning patch.
|
||
|
|
||
|
Bugs fixed:
|
||
|
|
||
|
- 118 - fix for boost compatibility.
|
||
|
- 123 - fix for tbb_machine.h.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20080512 open-source release
|
||
|
|
||
|
Changes (w.r.t. previous open-source release):
|
||
|
|
||
|
- Fixed a problem with backward binary compatibility
|
||
|
of debug Linux builds.
|
||
|
- Sun* Studio* support added.
|
||
|
- soname support added on Linux via linker script. To restore backward
|
||
|
binary compatibility, *.so -> *.so.2 softlinks should be created.
|
||
|
- concurrent_hash_map improvements - added a few new forms of insert()
|
||
|
method and fixed precondition and guarantees of erase() methods.
|
||
|
Added runtime warning reporting about bad hash function used for
|
||
|
the container. Various improvements for performance and concurrency.
|
||
|
- Cancellation mechanism reworked so that it does not hurt scalability.
|
||
|
- Algorithm parallel_do reworked. Requirement for Body::argument_type
|
||
|
definition removed, and work item argument type can be arbitrarily
|
||
|
cv-qualified.
|
||
|
- polygon_overlay example added.
|
||
|
- A few more improvements to code, tests, examples and Makefiles.
|
||
|
|
||
|
Open-source contributions integrated:
|
||
|
|
||
|
- Soname support patch for Bugzilla #112.
|
||
|
|
||
|
Bugs fixed:
|
||
|
|
||
|
- 112 - fix for soname support.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
Intel TBB 2.0 U3 commercial-aligned release (package 017, April 20, 2008)
|
||
|
|
||
|
Corresponds to commercial 019 (for Linux*, 020; for Mac OS* X, 018)
|
||
|
packages.
|
||
|
|
||
|
Changes (w.r.t. Intel TBB 2.0 U2 commercial-aligned release):
|
||
|
|
||
|
- Does not contain open-source-release changes below; this release is
|
||
|
only a minor update of TBB 2.0 U2.
|
||
|
- Removed spin-waiting in pipeline and concurrent_queue.
|
||
|
- A few more small bug fixes from open-source releases below.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20080408 open-source release
|
||
|
|
||
|
Changes (w.r.t. previous open-source release):
|
||
|
|
||
|
- count_strings example reworked: new word generator implemented, hash
|
||
|
function replaced, and tbb_allocator is used with std::string class.
|
||
|
- Static methods of spin_rw_mutex were replaced by normal member
|
||
|
functions, and the class name was versioned.
|
||
|
- tacheon example was renamed to tachyon.
|
||
|
- Improved support for Intel(R) Thread Checker.
|
||
|
- A few more minor improvements.
|
||
|
|
||
|
Open-source contributions integrated:
|
||
|
|
||
|
- Two sets of Sun patches for IA Solaris support.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20080402 open-source release
|
||
|
|
||
|
Changes (w.r.t. previous open-source release):
|
||
|
|
||
|
- Exception handling and cancellation support for tasks and algorithms
|
||
|
fully enabled.
|
||
|
- Exception safety guarantees defined and fixed for all concurrent
|
||
|
containers.
|
||
|
- User-defined memory allocator support added to all concurrent
|
||
|
containers.
|
||
|
- Performance improvement of concurrent_hash_map, spin_rw_mutex.
|
||
|
- Critical fix for a rare race condition during scheduler
|
||
|
initialization/de-initialization.
|
||
|
- New methods added for concurrent containers to be closer to STL,
|
||
|
as well as automatic filter removal from pipeline
|
||
|
and __TBB_AtomicAND function.
|
||
|
- The volatile keyword dropped from where it is not really needed.
|
||
|
- A few more minor improvements.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20080319 open-source release
|
||
|
|
||
|
Changes (w.r.t. previous open-source release):
|
||
|
|
||
|
- Support for gcc version 4.3 was added.
|
||
|
- tbb_thread class, near compatible with std::thread expected in C++0x,
|
||
|
was added.
|
||
|
|
||
|
Bugs fixed:
|
||
|
|
||
|
- 116 - fix for compilation issues with gcc version 4.2.1.
|
||
|
- 120 - fix for compilation issues with gcc version 4.3.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20080311 open-source release
|
||
|
|
||
|
Changes (w.r.t. previous open-source release):
|
||
|
|
||
|
- An enumerator added for pipeline filter types (serial vs. parallel).
|
||
|
- New task_scheduler_observer class introduced, to observe when
|
||
|
threads start and finish interacting with the TBB task scheduler.
|
||
|
- task_scheduler_init reverted to not use internal versioned class;
|
||
|
binary compatibility guaranteed with stable releases only.
|
||
|
- Various improvements to code, tests, examples and Makefiles.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20080304 open-source release
|
||
|
|
||
|
Changes (w.r.t. previous open-source release):
|
||
|
|
||
|
- Task-to-thread affinity support, previously kept under a macro,
|
||
|
now fully legalized.
|
||
|
- Work-in-progress on cache_aligned_allocator improvements.
|
||
|
- Pipeline now truly supports a parallel input stage; it is no longer serialized.
|
||
|
- Various improvements to code, tests, examples and Makefiles.
|
||
|
|
||
|
Bugs fixed:
|
||
|
|
||
|
- 119 - fix for scalable_malloc sometimes failing to return a big block.
|
||
|
- TR575 - fixed a deadlock occurring on Windows in startup/shutdown
|
||
|
under some conditions.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20080226 open-source release
|
||
|
|
||
|
Changes (w.r.t. previous open-source release):
|
||
|
|
||
|
- Introduced tbb_allocator to select between standard allocator and
|
||
|
tbb::scalable_allocator when available.
|
||
|
- Removed spin-waiting in pipeline and concurrent_queue.
|
||
|
- Improved performance of concurrent_hash_map by using tbb_allocator.
|
||
|
- Improved support for Intel(R) Thread Checker.
|
||
|
- Various improvements to code, tests, examples and Makefiles.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
Intel TBB 2.0 U2 commercial-aligned release (package 017, February 14, 2008)
|
||
|
|
||
|
Corresponds to commercial 017 (for Linux*, 018; for Mac OS* X, 016)
|
||
|
packages.
|
||
|
|
||
|
Changes (w.r.t. Intel TBB 2.0 U1 commercial-aligned release):
|
||
|
|
||
|
- Does not contain open-source-release changes below; this release is
|
||
|
only a minor update of TBB 2.0 U1.
|
||
|
- Add support for Microsoft* Visual Studio* 2008, including binary
|
||
|
libraries and VS2008 projects for examples.
|
||
|
- Use SwitchToThread() not Sleep() to yield threads on Windows*.
|
||
|
- Enhancements to Doxygen-readable comments in source code.
|
||
|
- A few more small bug fixes from open-source releases below.
|
||
|
|
||
|
Bugs fixed:
|
||
|
|
||
|
- TR569 - Memory leak in concurrent_queue.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20080207 open-source release
|
||
|
|
||
|
Changes (w.r.t. previous open-source release):
|
||
|
|
||
|
- Improvements and minor fixes in VS2008 projects for examples.
|
||
|
- Improvements in code for gating worker threads that wait for work,
|
||
|
previously consolidated under #if IMPROVED_GATING, now legalized.
|
||
|
- Cosmetic changes in code, examples, tests.
|
||
|
|
||
|
Bugs fixed:
|
||
|
|
||
|
- 113 - Iterators and ranges should be convertible to their const
|
||
|
counterparts.
|
||
|
- TR569 - Memory leak in concurrent_queue.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20080122 open-source release
|
||
|
|
||
|
Changes (w.r.t. previous open-source release):
|
||
|
|
||
|
- Updated examples/parallel_for/seismic to improve the visuals and to
|
||
|
use the affinity_partitioner (20071127 and forward) for better
|
||
|
performance.
|
||
|
- Minor improvements to unittests and performance tests.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20080115 open-source release
|
||
|
|
||
|
Changes (w.r.t. previous open-source release):
|
||
|
|
||
|
- Cleanup, simplifications and enhancements to the Makefiles for
|
||
|
building the libraries (see build/index.html for high-level
|
||
|
changes) and the examples.
|
||
|
- Use SwitchToThread() not Sleep() to yield threads on Windows*.
|
||
|
- Engineering work-in-progress on exception safety/support.
|
||
|
- Engineering work-in-progress on affinity_partitioner for
|
||
|
parallel_reduce.
|
||
|
- Engineering work-in-progress on improved gating for worker threads
|
||
|
(idle workers now block in the OS instead of spinning).
|
||
|
- Enhancements to Doxygen-readable comments in source code.
|
||
|
|
||
|
Bugs fixed:
|
||
|
|
||
|
- 102 - Support for parallel build with gmake -j.
|
||
|
- 114 - /Wp64 build warning on Windows*.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20071218 open-source release
|
||
|
|
||
|
Changes (w.r.t. previous open-source release):
|
||
|
|
||
|
- Full support for Microsoft* Visual Studio* 2008 in open-source.
|
||
|
Binaries for vc9/ will be available in future stable releases.
|
||
|
- New recursive_mutex class.
|
||
|
- Full support for 32-bit PowerMac including export files for builds.
|
||
|
- Improvements to parallel_do.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20071206 open-source release
|
||
|
|
||
|
Changes (w.r.t. previous open-source release):
|
||
|
|
||
|
- Support for Microsoft* Visual Studio* 2008 in building libraries
|
||
|
from source as well as in vc9/ projects for examples.
|
||
|
- Small fixes to the affinity_partitioner first introduced in 20071127.
|
||
|
- Small fixes to the thread-stack size hook first introduced in 20071127.
|
||
|
- Engineering work in progress on concurrent_vector.
|
||
|
- Engineering work in progress on exception behavior.
|
||
|
- Unittest improvements.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20071127 open-source release
|
||
|
|
||
|
Changes (w.r.t. previous open-source release):
|
||
|
|
||
|
- Task-to-thread affinity support (affinity partitioner) first appears.
|
||
|
- More work on concurrent_vector.
|
||
|
- New parallel_do algorithm (function-style version of parallel while)
|
||
|
and parallel_do/parallel_preorder example.
|
||
|
- New task_scheduler_init() hooks for getting default_num_threads() and
|
||
|
for setting thread stack size.
|
||
|
- Support for weak memory consistency models in the code base.
|
||
|
- Futex usage in the task scheduler (Linux).
|
||
|
- Started adding 32-bit PowerMac support.
|
||
|
- Intel(R) 9.1 compilers are now the base supported Intel(R) compiler
|
||
|
version.
|
||
|
- TBB libraries added to link line automatically on Microsoft Windows*
|
||
|
systems via #pragma comment linker directives.
|
||
|
|
||
|
Open-source contributions integrated:
|
||
|
|
||
|
- FreeBSD platform support patches.
|
||
|
- AIX weak memory model patch.
|
||
|
|
||
|
Bugs fixed:
|
||
|
|
||
|
- 108 - Removed broken affinity.h reference.
|
||
|
- 101 - Does not build on Debian Lenny (replaced arch with uname -m).
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20071030 open-source release
|
||
|
|
||
|
Changes (w.r.t. previous open-source release):
|
||
|
|
||
|
- More work on concurrent_vector.
|
||
|
- Better support for building with -Wall -Werror (or not) as desired.
|
||
|
- A few fixes to eliminate extraneous warnings.
|
||
|
- Begin introduction of versioning hooks so that the internal/API
|
||
|
version is tracked via TBB_INTERFACE_VERSION. The newest binary
|
||
|
libraries should always work with previously-compiled code
|
||
|
whenever possible.
|
||
|
- Engineering work in progress on using futex inside the mutexes (Linux).
|
||
|
- Engineering work in progress on exception behavior.
|
||
|
- Engineering work in progress on a new parallel_do algorithm.
|
||
|
- Unittest improvements.
|
||
|
|
||
|
------------------------------------------------------------------------
|
||
|
20070927 open-source release
|
||
|
|
||
|
Changes (w.r.t. Intel TBB 2.0 U1 commercial-aligned release):
|
||
|
|
||
|
- Minor update to TBB 2.0 U1 below.
|
||
|
- Begin introduction of new concurrent_vector interfaces not released
|
||
|
with TBB 2.0 U1.
|
||
|
|
||
|
------------------------------------------------------------------------
Intel TBB 2.0 U1 commercial-aligned release (package 014, October 1, 2007)

Corresponds to commercial 014 (for Linux*, 016) packages.

Changes (w.r.t. Intel TBB 2.0 commercial-aligned release):

- All open-source-release changes down to, and including, TBB 2.0
    below, were incorporated into this release.
- Made a number of changes to the officially supported OS list:
    Added Linux* OSs:
        Asianux* 3, Debian* 4.0, Fedora Core* 6, Fedora* 7,
        Turbo Linux* 11, Ubuntu* 7.04;
    Dropped Linux* OSs:
        Asianux* 2, Fedora Core* 4, Haansoft* Linux 2006 Server,
        Mandriva/Mandrake* 10.1, Miracle Linux* 4.0,
        Red Flag* DC Server 5.0;
    Only Mac OS* X 10.4.9 (and forward) and Xcode* tool suite 2.4.1 (and
        forward) are now supported.
- Commercial installers on Linux* fixed to recommend the correct
    binaries to use in more cases, with fewer unnecessary warnings.
- Changes to eliminate spurious build warnings.

Open-source contributions integrated:

- Two small header guard macro patches, which also fixed bug #94.
- New blocked_range3d class.

Bugs fixed:

- 93 - Removed misleading comments in task.h.
- 94 - See above.

------------------------------------------------------------------------
20070815 open-source release

Changes:

- Changes to eliminate spurious build warnings.
- Engineering work in progress on concurrent_vector allocator behavior.
- Added hooks to use the Intel(R) compiler code coverage tools.

Open-source contributions integrated:

- Mac OS* X build warning patch.

Bugs fixed:

- 88 - Fixed TBB compilation errors if both VS2005 and Windows SDK are
    installed.

------------------------------------------------------------------------
20070719 open-source release

Changes:

- Minor update to TBB 2.0 commercial-aligned release below.
- Changes to eliminate spurious build warnings.

------------------------------------------------------------------------
Intel TBB 2.0 commercial-aligned release (package 010, July 19, 2007)

Corresponds to commercial 010 (for Linux*, 012) packages.

- TBB open-source debut release.

------------------------------------------------------------------------
Intel TBB 1.1 commercial release (April 10, 2007)

Changes (w.r.t. Intel TBB 1.0 commercial release):

- Added auto_partitioner, which offers an automatic alternative to
    specifying a grain size parameter to estimate the best granularity
    for tasks.
- The release was added to the Intel(R) C++ Compiler 10.0 Pro.

------------------------------------------------------------------------
Intel TBB 1.0 Update 2 commercial release

Changes (w.r.t. Intel TBB 1.0 Update 1 commercial release):

- Mac OS* X 64-bit support added.
- Source packages for commercial releases introduced.

------------------------------------------------------------------------
Intel TBB 1.0 Update 1 commercial-aligned release

Changes (w.r.t. Intel TBB 1.0 commercial release):

- Fix for critical package issue on Mac OS* X.

------------------------------------------------------------------------
Intel TBB 1.0 commercial release (August 29, 2006)

Changes (w.r.t. Intel TBB 1.0 beta commercial release):

- New namespace (and compatibility headers for old namespace).
    Namespaces are tbb and tbb::internal, and all classes use the
    underscore_style rather than the WindowsStyle.
- New class: scalable_allocator (and cache_aligned_allocator using that
    if it exists).
- Added parallel_for/tachyon example.
- Removed C-style casts from headers for better C++ compliance.
- Bug fixes.
- Documentation improvements.
- Improved performance of the concurrent_hash_map class.
- Upgraded parallel_sort() to support STL-style random-access iterators
    instead of just pointers.
- The Windows vs7_1 directories were renamed to vs7.1 in examples.
- New class: spin version of reader-writer lock.
- Added push_back() interface to concurrent_vector.

------------------------------------------------------------------------
Intel TBB 1.0 beta commercial release

Initial release.

Features / APIs:

- Concurrent containers: ConcurrentHashTable, ConcurrentVector,
    ConcurrentQueue.
- Parallel algorithms: ParallelFor, ParallelReduce, ParallelScan,
    ParallelWhile, Pipeline, ParallelSort.
- Support: AlignedSpace, BlockedRange (i.e., 1D), BlockedRange2D.
- Task scheduler with multi-master support.
- Atomics: read, write, fetch-and-store, fetch-and-add, compare-and-swap.
- Locks: spin, reader-writer, queuing, OS-wrapper.
- Memory allocation: STL-style memory allocator that avoids false
    sharing.
- Timers.

Tools Support:
- Intel(R) Thread Checker 3.0.
- Intel(R) Thread Profiler 3.0.

Documentation:
- First Use Documents: README.txt, INSTALL.txt, Release_Notes.txt,
    Doc_Index.html, Getting_Started.pdf, Tutorial.pdf, Reference.pdf.
- Class hierarchy HTML pages (Doxygen).
- Tree of index.html pages for navigating the installed package, esp.
    for the examples.

Examples:
- One for each of these TBB features: ConcurrentHashTable, ParallelFor,
    ParallelReduce, ParallelWhile, Pipeline, Task.
- Live copies of examples from Getting_Started.pdf.
- TestAll example that exercises every class and header in the package
    (i.e., a "liveness test").
- Compilers: see Release_Notes.txt.
- APIs: OpenMP, WinThreads, Pthreads.

Packaging:
- Package for Windows installs IA-32 and EM64T bits.
- Package for Linux installs IA-32, EM64T and IPF bits.
- Package for Mac OS* X installs IA-32 bits.
- All packages support Intel(R) software setup assistant (ISSA) and
    install-time FLEXlm license checking.
- ISSA support allows license file to be specified directly in case of
    no Internet connection or problems with IRC or serial #s.
- Linux installer allows root or non-root, RPM or non-RPM installs.
- FLEXlm license servers (for those who need floating/counted licenses)
    are provided separately on Intel(R) Premier.

------------------------------------------------------------------------
Intel and Cilk are registered trademarks or trademarks of Intel Corporation or its
subsidiaries in the United States and other countries.

* Other names and brands may be claimed as the property of others.