Each DSP block contains two slices, and each slice contains multiple
MULT18X18D and ALU54B units. Each unit configures each register to use
any of CLK0/1/2/3, CE0/1/2/3, and RST0/1/2/3 ports, and the ports are
connected per unit (so for example, two MULTs in the same block could
connect their CLK0s to different external signals). However, the
hardware only has one actual port per block, so it's required that
all CLK0 signals within a block are the same.
Because the packer is in general allowed to combine two unrelated units
into one block, it may end up combining units that use different signals
for the same port, which would eventually cause a router failure.
This commit adds validity checks which ensure that each port carries
only one signal per block, and adds remapping so that conflicting
signals are automatically reassigned when possible and required.
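
As a rough illustration of the check and remap described above, here is a
minimal standalone sketch. It models only the CLK ports (CE and RST would
be handled the same way), and the DspUnit struct and both function names
are hypothetical, not the packer's actual data structures.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of one MULT/ALU unit: the net connected to each of its
// four logical CLK ports ("" = unconnected), plus the port index (0..3)
// that each of its registers selects.
struct DspUnit {
    std::array<std::string, 4> clk;
    std::vector<int> reg_clk_sel;
};

// Validity check: for every port index, all units in the block that connect
// that port must connect it to the same net.
bool block_clocks_valid(const std::vector<DspUnit> &units)
{
    for (int port = 0; port < 4; port++) {
        std::string seen;
        for (const DspUnit &u : units) {
            const std::string &net = u.clk[port];
            if (net.empty())
                continue;
            if (!seen.empty() && seen != net)
                return false;
            seen = net;
        }
    }
    return true;
}

// Remap: give every distinct net used by a register its own block-level
// port, then rewrite each unit's connections and register selections.
// Ports that no register selects are dropped. Returns false (leaving the
// units untouched) if the block needs more than four distinct nets.
bool remap_block_clocks(std::vector<DspUnit> &units)
{
    std::vector<std::string> physical; // physical[i] = net on block CLKi
    for (const DspUnit &u : units)
        for (int sel : u.reg_clk_sel) {
            assert(sel >= 0 && sel < 4);
            const std::string &net = u.clk[sel];
            assert(!net.empty()); // sketch assumes selected ports are connected
            if (std::find(physical.begin(), physical.end(), net) == physical.end())
                physical.push_back(net);
        }
    if (physical.size() > 4)
        return false;
    for (DspUnit &u : units) {
        const std::array<std::string, 4> old_clk = u.clk;
        u.clk.fill(std::string());
        for (int &sel : u.reg_clk_sel) {
            const std::string &net = old_clk[sel];
            int idx = int(std::find(physical.begin(), physical.end(), net) - physical.begin());
            sel = idx;
            u.clk[idx] = net;
        }
    }
    return true;
}
```

With two units that both use CLK0 but for different nets, the check fails
before remapping; afterwards the second unit's register selects CLK1.
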
This makes predictDelay be based on an arbitrary belpin pair rather
than an arc of a net based on cell placement. This way 'what-if'
decisions can be evaluated without actually changing placement;
potentially useful for parallel placement.
A new helper predictArcDelay behaves like the old predictDelay to
minimise the impact on existing passes; only arches need be updated.
Signed-off-by: gatecat <gatecat@ds0.me>
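
To show the split between the two functions described above, here is a
small sketch. All types (Bel, Cell, PortRef, Net) are simplified
stand-ins and the Manhattan-distance cost model is invented purely for
illustration; only the names predictDelay and predictArcDelay come from
the commit message.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

using delay_t = int;

// Hypothetical stand-ins for the real arch types.
struct Bel { int x, y; };                 // a bel, identified by grid location
struct Cell { Bel bel; };                 // a cell and the bel it is placed on
struct PortRef { const Cell *cell; std::string port; };
struct Net { PortRef driver; };

// Estimate delay between two belpins. Takes no placement state, so a placer
// can ask "what if this cell were on that bel?" without moving anything.
// The cost model (fixed overhead plus per-tile cost) is made up.
delay_t predictDelay(const Bel &src_bel, const std::string &src_pin,
                     const Bel &dst_bel, const std::string &dst_pin)
{
    (void)src_pin; // this toy model ignores which pin is used
    (void)dst_pin;
    int dist = std::abs(src_bel.x - dst_bel.x) + std::abs(src_bel.y - dst_bel.y);
    return 100 + 50 * dist;
}

// Compatibility helper: look up the current placement of the driver and
// sink, then defer to the belpin-based estimate.
delay_t predictArcDelay(const Net &net, const PortRef &sink)
{
    assert(net.driver.cell != nullptr && sink.cell != nullptr);
    return predictDelay(net.driver.cell->bel, net.driver.port,
                        sink.cell->bel, sink.port);
}
```

Existing passes would keep calling the arc-based helper, while a
speculative placer could call the belpin-based function directly with
candidate locations.
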
This replaces the arch-specific DelayInfo structure with new DelayPair
(min/max only) and DelayQuad (min/max for both rise and fall) structures
that form part of common code.
This further reduces the amount of arch-specific code; and also provides
useful data structures for timing analysis which will need to deal
with pairs/quads of delays as it is improved.
While there may be a small performance cost to arches that didn't
separate the rise/fall cases (arches that aren't currently separating
the min/max cases just need to be fixed...) in DelayInfo, my expectation
is that inlining will mean this doesn't make much difference.
Signed-off-by: gatecat <gatecat@ds0.me>
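
For readers unfamiliar with the idea, the following is a minimal sketch
of what min/max pair and rise/fall quad structures can look like. The
member names, constructors and operators here are assumptions made for
illustration and may not match the definitions in common code.

```cpp
#include <algorithm>
#include <cassert>

using delay_t = int;

// Min/max delay only.
struct DelayPair {
    delay_t min_delay = 0, max_delay = 0;

    DelayPair() = default;
    explicit DelayPair(delay_t d) : min_delay(d), max_delay(d) {}
    DelayPair(delay_t mn, delay_t mx) : min_delay(mn), max_delay(mx) { assert(mn <= mx); }

    // Delays along a path add component-wise.
    DelayPair operator+(const DelayPair &o) const
    {
        return DelayPair(min_delay + o.min_delay, max_delay + o.max_delay);
    }
};

// Min/max delay for both the rising and the falling edge.
struct DelayQuad {
    DelayPair rise, fall;

    DelayQuad() = default;
    explicit DelayQuad(delay_t d) : rise(d), fall(d) {}
    DelayQuad(DelayPair r, DelayPair f) : rise(r), fall(f) {}

    // Collapse to the overall best and worst case across both edges.
    delay_t minDelay() const { return std::min(rise.min_delay, fall.min_delay); }
    delay_t maxDelay() const { return std::max(rise.max_delay, fall.max_delay); }
    DelayPair delayPair() const { return DelayPair(minDelay(), maxDelay()); }

    DelayQuad operator+(const DelayQuad &o) const
    {
        return DelayQuad(rise + o.rise, fall + o.fall);
    }
};
```

An arch that does not distinguish rise from fall can construct a quad
from a single value, which is the case where inlining is expected to
hide the extra fields.
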
This makes the difference clearer between the general arch API that
everyone must implement; and helper functions specific to one arch.
Signed-off-by: D. Shah <dave@ds0.me>
This is a complete implementation of IdStringList for ECP5; excluding
the GUI (which you will have to disable for it to build).
Signed-off-by: D. Shah <dave@ds0.me>
This uses the new IdStringList API to store bel names for the ECP5. Note
that other arches and the GUI do not yet build with this
proof-of-concept patch.
getBelByName still uses the old implementation and could be more
efficiently implemented with further development.
Signed-off-by: D. Shah <dave@ds0.me>
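
The general idea of storing a bel name as a list of interned components
can be sketched as below. StringPool and NameList are hypothetical
names chosen for this example, and the '/' delimiter is an assumption;
this is not the IdStringList API itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical string interner: each distinct string gets a small integer id.
class StringPool {
    std::unordered_map<std::string, int> index;
    std::vector<std::string> strings;

  public:
    int id(const std::string &s)
    {
        auto it = index.find(s);
        if (it != index.end())
            return it->second;
        int n = int(strings.size());
        strings.push_back(s);
        index.emplace(s, n);
        return n;
    }
    const std::string &str(int id) const
    {
        assert(id >= 0 && std::size_t(id) < strings.size());
        return strings[id];
    }
};

// A name stored as a list of interned components, e.g. {"X4", "Y7", "SLICEA"}.
// Components such as "X4" are shared between all names that use them.
struct NameList {
    std::vector<int> ids;

    static NameList parse(StringPool &pool, const std::string &name, char delim = '/')
    {
        NameList out;
        std::size_t start = 0;
        while (true) {
            std::size_t end = name.find(delim, start);
            std::size_t len = (end == std::string::npos) ? std::string::npos : end - start;
            out.ids.push_back(pool.id(name.substr(start, len)));
            if (end == std::string::npos)
                break;
            start = end + 1;
        }
        return out;
    }

    std::string str(const StringPool &pool, char delim = '/') const
    {
        std::string out;
        for (std::size_t i = 0; i < ids.size(); i++) {
            if (i > 0)
                out += delim;
            out += pool.str(ids[i]);
        }
        return out;
    }

    bool operator==(const NameList &o) const { return ids == o.ids; }
};
```

Comparing two such names is a comparison of short integer vectors, and
the full string is only rebuilt when it actually has to be printed.
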
This replaces RelPtrs and a separate length field with a Rust-style
slice containing both a pointer and a length; with bounds checking
always enforced.
Thus iterating over these structures is both cleaner and safer.
Signed-off-by: D. Shah <dave@ds0.me>
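
A minimal sketch of such a relative slice follows, assuming a layout of
one 32-bit byte offset plus one 32-bit element count. The field layout
is an assumption, a standard exception stands in for whatever assertion
mechanism the real code uses, and DemoBlob exists only to give the
example something to point at.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// A pointer-plus-length view into a binary blob. The pointer is stored as a
// byte offset relative to the slice object itself, so the blob can be mapped
// at any address without fixups.
template <typename T> struct RelSlice {
    int32_t offset; // bytes from this object to the first element
    int32_t length; // number of elements

    const T *get() const
    {
        return reinterpret_cast<const T *>(reinterpret_cast<const char *>(this) + offset);
    }
    std::size_t size() const
    {
        assert(length >= 0);
        return std::size_t(length);
    }
    // Bounds-checked element access.
    const T &operator[](std::size_t i) const
    {
        if (i >= size())
            throw std::out_of_range("RelSlice index out of range");
        return get()[i];
    }
    // begin()/end() make range-based for loops work directly.
    const T *begin() const { return get(); }
    const T *end() const { return get() + size(); }
};

// Example layout for demonstration only: a slice header followed by its data.
struct DemoBlob {
    RelSlice<int32_t> items;
    int32_t data[3];
};
```

Because the length travels with the pointer, callers no longer pass a
separate count around, and a loop such as `for (int32_t v : blob.items)`
cannot run past the end.
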
This involves very few changes, all typical to WASM ports:
  * WASM doesn't currently support threads or atomics so those are
    disabled.
  * WASM doesn't currently support exceptions so the exception
    machinery is stubbed out.
  * WASM doesn't (and can't) have mmap(), so an emulation library is
    used. That library currently doesn't support MAP_SHARED flags,
    so MAP_PRIVATE is used instead.
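
The mmap() point above can be pictured with the following POSIX sketch.
It is illustrative only: the function names are hypothetical, the
`__wasm__` macro check is an assumption about how the target is
detected, and the actual port may select the flag elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Choose the mapping flag for a read-only file mapping. For read-only data
// the two flags behave the same from the reader's point of view, so falling
// back to MAP_PRIVATE where MAP_SHARED is unavailable loses nothing.
constexpr int chipdb_map_flags()
{
#if defined(__wasm__)
    return MAP_PRIVATE; // assumed: emulated mmap() lacks MAP_SHARED
#else
    return MAP_SHARED;
#endif
}

// Map a whole file read-only. Returns nullptr on any failure.
const void *map_file_readonly(const char *path, std::size_t &size_out)
{
    assert(path != nullptr);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return nullptr;
    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return nullptr;
    }
    size_out = std::size_t(st.st_size);
    void *p = mmap(nullptr, size_out, PROT_READ, chipdb_map_flags(), fd, 0);
    close(fd); // the mapping stays valid after the descriptor is closed
    return p == MAP_FAILED ? nullptr : p;
}
```
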
There is also an update to bring ECP5 bbasm CMake rules to parity
with iCE40 ones, since although it is possible to embed chipdb into
nextpnr on WASM, a 200 MB WASM file has very few practical uses.
The README is not updated and there is no included toolchain file
because at the moment it's not possible to build nextpnr with
upstream boost and wasi-libc. Boost requires a patch (merged, will
be available in boost 1.74.0), wasi-libc requires a few unmerged
patches.
If the REG_INPUTA_CLK and REG_INPUTB_CLK values are set, then we should
use the faster setup/hold timings for the 18x8 multiplier.
Similarly, check the value of REG_OUTPUT_CLK for whether or not to use
faster timings for the output.
This is based on how I currently understand the registers to work - if
anyone knows the actual rules for when each timing applies please do
chime in to correct this implementation if necessary.
Along the same lines, this PR does not address the case when the
pipeline registers are enabled, since it is not clear to me how exactly
that affects the timing.
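
The selection logic described above amounts to something like the sketch
below. It assumes that a register counts as "set" when its parameter is
present and not "NONE", and that both input registers must be set for
the faster input timings to apply; both are guesses in the same spirit
as the caveat above. MultTimingClass and the function names are
hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

using Params = std::map<std::string, std::string>;

// Assumption: a register is enabled when its clock parameter exists and is
// set to anything other than "NONE".
bool reg_enabled(const Params &params, const std::string &name)
{
    assert(!name.empty());
    auto it = params.find(name);
    return it != params.end() && it->second != "NONE";
}

// Which timing variants to look up for a multiplier cell.
struct MultTimingClass {
    bool fast_inputs; // use the registered-input setup/hold timings
    bool fast_output; // use the registered-output timings
};

// Pipeline registers are deliberately not considered here.
MultTimingClass classify_mult(const Params &params)
{
    MultTimingClass t;
    t.fast_inputs = reg_enabled(params, "REG_INPUTA_CLK") &&
                    reg_enabled(params, "REG_INPUTB_CLK");
    t.fast_output = reg_enabled(params, "REG_OUTPUT_CLK");
    return t;
}
```
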