hackrf

Author	SHA1	Message	Date
Martin Ling	03551cb1fd	Detect whether the M0 missed its deadline. Counter-intuitively, this actually saves us two cycles because we unroll the first iteration of the loop that spins on the interrupt flag, saving a branch in the case that the flag is clear the first time.	2024-11-26 19:34:29 +00:00
Martin Ling	d21f01f7b4	In conditional branch table, list one destination per line, in order.	2024-11-26 19:04:42 +00:00
Steven A. Falco	7dbf6d65b6	Make definition of "prev" consistent	2022-11-30 16:53:11 -05:00
Michael Ossmann	06b9d7bee0	Clean up source code copyright notices.	2022-09-23 14:46:52 -04:00
Martin Ling	ad3216435a	Fix overlapping register allocations.	2022-02-28 23:02:34 +00:00
Martin Ling	f3633e285f	Replace direct setting of M0 mode with a request/ack mechanism. This change avoids various possible races in which an autonomous mode change by the M0 might clobber a mode change made from the M4, as well as related races on other state fields that can be written by the M4. The previous mode field is replaced by two separate ones: - active_mode, which is written only by the M0, and indicates the current operating mode. - requested_mode, which is written by the M4 to request a change. This field includes both the requested mode, and a flag bit. The M4 writes the field with the flag bit set, and must then wait for the M0 to signal completion of the request by clearing the flag bit. Whilst the M4 is blocked waiting for the flag bit to be cleared, the M0 can safely make all the required changes to the state that are needed for the transition to the requested mode. Once the transition is complete, the M0 clears the flag bit and the M4 continues execution. Request handling is implemented in the idle loop. To handle requests, mode-specific loops simply need to check the request flag and branch to idle if it is set. A request from the M4 to change modes will always require passing through the idle loop, and is not subject to timing guarantees. Only transitions made autonomously by the M0 have guaranteed timing constraints. The work previously done in reset_counts is now implemented as part of the request handling, so the tx_start, rx_start and wait_start labels are no longer required. An extra two cycles are required in the TX shortfall path because we must now load the active mode to check whether we are in TX_START. Two cycles are saved in the normal TX path because updating the active mode to TX_RUN can now be done without checking the previous value.	2022-02-13 17:53:34 +00:00
Martin Ling	137f2481e5	Make an error code available when a shortfall limit is hit. Previously, finding the M0 in IDLE mode was ambiguous; it could indicate either a normal outcome, or a shortfall limit having been hit. To disambiguate, we add an error field to the M0 state. The errors currently possible are an RX timeout or a TX timeout, both of which can be obtained efficiently from the current operating mode due to the values used. This adds 3 cycles to both shortfall paths, in order to shift down the mode to obtain the error code, and store it to the M0 state.	2022-02-13 17:53:34 +00:00
Martin Ling	8bd3745253	Add some additional commentary.	2022-02-13 17:53:34 +00:00
Martin Ling	cca7320fe4	Add a wait mode for the M0. In wait mode, the byte counter is advanced, but no SGPIO read/writes are done. This mode is intended to be used for implementing timed operations.	2022-02-13 16:46:12 +00:00
Martin Ling	3618a5352f	Add a counter threshold at which the M0 will change to a new mode. This lays the groundwork for implementing timed operations (#86). The M0 can be configured to automatically change modes when its byte count reaches a specific value. Checking the counter against the threshold and dispatching to the next mode is handled by a new `jump_next_mode` macro, which replaces the unconditional branches back to the start of the TX and RX loops. Making this change work requires some rearrangement of the code, such that the destinations of all conditional branch instructions are within reach. These branch instructions (`b[cond] label`) have a range of -256 to +254 bytes from the current program counter. For this reason, the TX shortfall handling is moved earlier in the file, and branches in the idle loop are restructured to use an unconditional branch to rx_start, which is furthest away. The additional code for switching modes adds 9 cycles to the normal RX path, and 10 to the TX path (the difference is because the dispatch in `jump_next_mode` is optimised for the longer RX path).	2022-02-13 16:46:12 +00:00
Martin Ling	7124b7192b	Roll back shortfall stats if switched to idle in a shortfall. During shutdown of TX or RX, the host may stop supplying or retrieving sample data some time before a stop request causes the M0 to be set back to idle mode. This makes it common for a spurious shortfall to occur during shutdown, giving the misleading impression that there has been a throughput problem. In fact, the final shortfall is simply an artifact. This commit detects when this happens, and excludes the spurious shortfall from the stats. To implement this, we back up the shortfall stats whenever a new shortfall begins. If the new shortfall later turns out to be spurious, as indicated by a transition to IDLE while it is ongoing, then we roll back the stats to their previous values. We actually only need to back up the previous longest shortfall length. To get the previous shortfall count, we can simply subtract one from the current shortfall count. This change adds four cycles to the two shortfall paths - a load and store to back up the previous longest shortfall length.	2022-02-13 16:46:12 +00:00
Martin Ling	a5e1521535	Don't update buffer pointer until after checking for shortfall. The buffer pointer is not needed in the shortfall paths. Moving this update after the shortfall checks saves 3 cycles in each shortfall path.	2022-02-13 16:46:12 +00:00
Martin Ling	0e99419be2	Don't load M0 byte count from memory. This count is only written by the M0, so there's no need to reload it when the current value is already retained in a register. Removing this load saves two cycles in all code paths.	2022-02-13 16:46:12 +00:00
Martin Ling	4e205994e3	Use separate loops for RX and TX modes. Using our newly-defined macros, it's now straightforward to write separate loops for RX and TX, with the idle loop dispatching to them when a new mode setting is written by the M4. This saves some cycles by reducing branches needed within each loop, and makes it simpler to add new modes. For macros which use internal labels, a name parameter is added. This parameter is prefixed to the labels used, so that each mode's use of that macro produces its own label names. Similarly, where branches were taken in the handle_shortfall macro to the "loop" label, these are replaced with the appropriate tx_loop or rx_loop label. The syntax `\name\()_suffix` is necessary to perform concatenation in the GNU assembler.	2022-02-13 16:46:12 +00:00
Martin Ling	f08e0c17bf	Use new macros in M0 code. This commit is separate from the previous one which adds the macros, in order to make the diffs easier to read.	2022-02-13 16:46:12 +00:00
Martin Ling	9d570cb558	Add macro versions of key parts of M0 code. This commit is separate from the following one which uses the macros, in order to make the diffs easier to read.	2022-02-13 16:46:12 +00:00
Martin Ling	00b5ed7d62	Add an M0 TX_START mode, in which zeroes are sent until data is ready. In TX_START mode, a lack of data to send is not treated as a shortfall. Zeroes are written to SGPIO, but no shortfall is recorded in the stats. Using this mode helps avoid spurious shortfalls at startup. As soon as there is data to transmit, the M0 switches to TX_RUN mode. This change adds five cycles to the normal TX path, in order to check for TX_START mode before sending data, and to switch to TX_RUN in that case. It also adds two cycles to the TX shortfall path, to check for TX_START mode and skip shortfall processing in that mode. Note the allocation of r3 to store the mode setting, such that this value is still available after the tx_zeros routine.	2022-02-13 16:46:12 +00:00
Martin Ling	f0bc6eda30	Add a shortfall length limit. This limit allows implementing a timeout: if a TX underrun or RX overrun continues for the specified number of bytes, the M0 will revert to idle. A setting of zero disables the limit. This change adds 5 cycles to the TX & RX shortfall paths, to check if a limit is set and to check the shortfall length against the limit.	2022-02-13 16:46:12 +00:00
Martin Ling	2c86f493d9	Keep track of longest shortfall. This adds six cycles to the TX and RX shortfall paths.	2022-02-13 16:46:12 +00:00
Martin Ling	a7bd1e3ede	Keep count of number of shortfalls. To enable this, we keep a count of the current shortfall length. Each time an SGPIO read/write cannot be completed due to a shortfall, we increase this length. Each time an SGPIO read/write is completed successfully, we reset the shortfall length to zero. When a shortfall occurs and the existing shortfall length is zero, this indicates a new shortfall, and the shortfall count is incremented. This change adds one cycle to the normal RX & TX paths, to zero the shortfall length. To enable this to be done in a single cycle, we keep a zero handy in a high register. The extra accounting adds 10 cycles to the TX and RX shortfall paths, plus an additional 3 cycles to the RX shortfall path since there are now two branches involved: one to the shortfall handler, and another back to the main loop.	2022-02-13 16:46:12 +00:00
Martin Ling	0f3069ee5e	Move resetting of byte counts to the M0. Previously, these counts were zeroed by the M4 when leaving the OFF transceiver mode. Instead, do this on the M0 at the point where the M0 leaves IDLE mode. This avoids a potential race in which the M4 zeroes the M0 count after the M0 has already started incrementing it.	2022-02-13 16:46:12 +00:00
Martin Ling	32c725dd61	Add an idle mode for the M0. In the idle mode, the M0 simply waits for a different mode to be set. No SGPIO access is done. One extra cycle is added to both TX code paths, to check whether the M0 should return to the idle loop based on the mode setting. The RX paths are unaffected as the branch to RX is handled first.	2022-02-13 16:46:12 +00:00
Martin Ling	5b50b2dfac	Replace TX flag with a mode setting. This is to let us start adding new operating modes for the M0.	2022-02-13 16:46:12 +00:00
Martin Ling	c0d0cd2a1d	Check for sufficient bytes, or space in buffer, before proceeding. In TX, check if there are sufficient bytes in the buffer to write a block to SGPIO. If not, write zeros to SGPIO instead. In RX, check if there is sufficient space in the buffer to store a block read from SGPIO. If not, do nothing, which discards the data. In both of these shortfall cases, the M0 count is not incremented. This ensures that in TX, old data is never repeated. The M0 will not resume writing TX samples to SGPIO until the M4 count advances, indicating new data being ready in the buffer. This fixes bug #180. Similarly, in RX, old data is never overwritten. The M0 will not resume writing RX samples to the buffer until the M4 count advances, indicating new space being available in the buffer.	2022-02-13 16:46:12 +00:00
Martin Ling	79853d2b28	Add a second counter to keep track of bytes transferred by the M4. With both counters in place, the number of bytes in the buffer is now indicated by the difference between the M0 and M4 counts. The M4 count needs to be increased whenever the M4 produces or consumes data in the USB bulk buffer, so that the two counts remain correctly synchronised. There are three places where this is done: 1. When a USB bulk transfer in or out of the buffer completes, the count is increased by the number of bytes transferred. This is the most common case. 2. At TX startup, the M4 effectively sends the M0 16K of zeroes to transmit, before the first host-provided data. This is done by zeroing the whole 32K buffer area, and then setting up the first bulk transfer to write to the second 16K, whilst the M0 begins transmission of the first 16K. The count is therefore increased by 16K during TX startup, to account for the initial 16K of zeros. 3. In sweep mode, some data is discarded. When this is done, the count is incremented by the size of the discarded data. The USB IRQ is masked whilst doing this, since a read-modify-write is required, and the bulk transfer completion callback may be called at any point, which also increases the count.	2022-02-13 16:46:12 +00:00
Martin Ling	21dabc920f	Replace M0 state offset field with a byte count. Instead of this count wrapping at the buffer size, it now increments continuously. The offset within the buffer is now obtained from the lower bits of the count. This makes it possible to keep track of the total number of bytes transferred by the M0 core. The count will wrap at 2^32 bytes, which at 20Msps will occur every 107 seconds.	2022-02-13 16:46:12 +00:00
Martin Ling	98df8c23be	Fix a typo.	2022-01-03 18:48:04 +00:00
Martin Ling	42a7c5ede9	Add a label at the end of the code to indicate the literal pool. This makes objdump disassembly of the code a bit clearer, by separating the constants from code following the last label.	2022-01-03 18:48:04 +00:00
Martin Ling	59be1fef5a	Add pseudocode for all instructions. This is intended to make the code possible to follow without knowledge of the ARM instruction set.	2022-01-03 18:48:04 +00:00
Martin Ling	030898315d	Remove unused constants. Neither of these constants was used in the code.	2022-01-03 18:48:04 +00:00
Martin Ling	14065bb69d	Initialise M0 state at startup. This is not currently essential, since the current M4 code will not trigger an SGPIO interrupt until the offset and tx fields are set. In future though, we want to explicitly set up the M0 state here.	2022-01-03 18:48:04 +00:00
Martin Ling	5df28efb3f	Assign names to registers used for temporary purposes. This is just to improve readability; there is no change to the code.	2022-01-03 18:48:04 +00:00
Martin Ling	e531fb507b	Use faster way to calculate buffer pointer. One of the few instructions that can use the high registers (r8-r14) is the add instruction, which can add any two registers, as long as one of them is also used as the destination register. By using this form of add, we can add buf_base (in a high register) to the offset within the buffer (in a low register), to get the desired pointer value (buf_ptr) which we want to access. This saves one cycle by eliminating the need to move buf_base to a low register first.	2022-01-03 18:48:04 +00:00
Martin Ling	c1c665d5b8	Stash which interrupt bits were set, and use them to clear. The lsr instruction here shifts the value in r0 right by one bit, putting the LSB into the carry flag. By setting the destination register to r1, we can retain the original unshifted value in r0, and later write this to the INT_CLEAR register in order to clear all bits that were set. This saves two cycles by avoiding the need to load an 0xFFFF value to write to INT_CLEAR.	2022-01-03 18:48:04 +00:00
Martin Ling	bd7d0b9194	Correct a misleading comment. The effect of the lsr instruction here is to shift the LSB of r0 into the processor's carry flag. The subsequent bcc instruction ("branch if carry clear") will then branch if this bit was zero. The LSB corresponds to the exchange interrupt flag for slice A only. The other interrupt flag bits are not checked here, contrary to the comment.	2022-01-03 18:48:04 +00:00
Martin Ling	f8ea1e8e56	Use stack pointer to hold base address of state structure. Keeping the base address of this structure in a register allows us to use offsets to load individual fields from it, without needing their individual addresses. However, the ldr instruction can only use immediate offsets relative to the low registers (r0-r7), or the stack pointer (r13). Low registers are in short supply and are needed for other instructions which can only use r0-r7, so we use the stack pointer here. It's safe to do this because we do not use the stack. There are no function calls, interrupt handlers or push/pop instructions in the M0 code. This change saves four cycles by eliminating loads of the addresses for the offset & tx registers, plus a further two by eliminating the need to stash one of these addresses in r8.	2022-01-03 18:48:04 +00:00
Martin Ling	2f26ebffd4	Keep buffer base & size mask in high registers. The high registers (r8-r14) cannot be used directly by most of the instructions in the Cortex-M0 instruction set. One of the few instructions that can use them is mov, which can use any pair of registers. This allows saving two cycles, by replacing two loads (2 cycles each) with moves (1 cycle each), after stashing the required values in high registers at startup.	2022-01-03 18:48:04 +00:00
Martin Ling	8f43dc1be5	Use a register to hold base address of SGPIO interrupt registers. This allows us to use ldr/str with an immediate offset to access the SGPIO interrupt registers, rather than first having to load a register with the specific address we want to access. This change saves a total of 6 cycles, by eliminating two loads (2 cycles each), one of which could be executed twice.	2022-01-03 18:48:04 +00:00
Martin Ling	9206a8b752	Free up two registers by accessing SGPIO in two 16-byte chunks. The current code does reads and writes in two chunks: one of 6 words, followed by one of 2. Instead, use two chunks of 4 words each. This takes the same number of total cycles, but frees up two registers for other uses. Note that we can't do things in one chunk, because we'd need eight registers to hold the data, plus a ninth to hold the buffer pointer. The ldm/stm instructions can only use the eight low registers, r0-r7. So we have to use two chunks, and the most register-efficient way to do that is to use two equal chunks.	2022-01-03 18:48:04 +00:00
Martin Ling	c6362381d1	Initialise register with a constant value before SGPIO loop. Previously this register was reloaded with the same value during each loop. Initialising it once, outside the loop, saves two cycles. Note the separation of the loop start ("loop") from the entry point ("main"). Code between these labels will be run once, at startup.	2022-01-03 18:48:04 +00:00
Martin Ling	f61a03dead	Assign names to registers which are used for a single purpose. This is just to improve readability; there is no change to the code.	2022-01-03 18:48:04 +00:00
Martin Ling	dc0f8f48c5	Use defines for offsets into SGPIO shadow registers. This is just to make the SGPIO code less cryptic, and to place the explanation of the offsets closer to where they are defined.	2022-01-03 18:48:04 +00:00
Martin Ling	3d9802260e	Document purpose and timing of existing M0 code. This commit does not modify the code; it only updates comments.	2022-01-03 18:47:24 +00:00
Mike Walters	e76eace09d	Use the M0 to collect SGPIO samples	2020-01-20 14:22:30 +00:00
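
The counter scheme described in commits 21dabc920f, 79853d2b28 and c0d0cd2a1d can be modelled in a few lines. The following is a minimal sketch in Python, not the firmware's own code: the function names are hypothetical, the direction of each subtraction (M4 ahead of M0 in TX, M0 ahead of M4 in RX) is an assumption drawn from the descriptions above, and two bytes per sample is an assumption chosen because it reproduces the 107-second wrap figure quoted for 20 Msps.

```python
MASK32 = 0xFFFFFFFF     # both counts are 32-bit and wrap at 2^32
BUF_SIZE = 32 * 1024    # the 32K buffer area mentioned in the log
BLOCK = 32              # one SGPIO access: two 16-byte chunks


def buffer_offset(count):
    """Offset within the buffer, taken from the lower bits of the count."""
    return count & (BUF_SIZE - 1)


def tx_bytes_ready(m0_count, m4_count):
    """TX (assumed direction): bytes written by the M4, not yet sent by the M0."""
    return (m4_count - m0_count) & MASK32


def rx_space_free(m0_count, m4_count):
    """RX (assumed direction): buffer space the M0 may still fill."""
    return BUF_SIZE - ((m0_count - m4_count) & MASK32)


def tx_shortfall(m0_count, m4_count):
    """True if there is less than one block of data to send."""
    return tx_bytes_ready(m0_count, m4_count) < BLOCK


def rx_shortfall(m0_count, m4_count):
    """True if there is less than one block of free space."""
    return rx_space_free(m0_count, m4_count) < BLOCK


def seconds_until_wrap(sample_rate_hz, bytes_per_sample=2):
    """Time for a 32-bit byte count to wrap (bytes_per_sample is assumed)."""
    return 2**32 / (sample_rate_hz * bytes_per_sample)
```

Masking each difference to 32 bits keeps the comparison valid across a wrap of either count, which is what lets the counts increment continuously instead of wrapping at the buffer size.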

44 Commits
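
The request/ack mechanism described in commit f3633e285f can be illustrated with a small model. This is a sketch in Python under stated assumptions, not the M0 or M4 source: the mode values, the position of the flag bit, and all function names are hypothetical, and which fields are reset during request handling is assumed to be the two byte counts.

```python
# Hypothetical encoding; the log does not give the real values.
IDLE, RX, TX_START = 0, 2, 3
REQUEST_FLAG = 1 << 8


class M0State:
    def __init__(self):
        self.active_mode = IDLE     # written only by the M0
        self.requested_mode = IDLE  # written by the M4; flag cleared by the M0
        self.m0_count = 0
        self.m4_count = 0


def m4_request_mode(state, mode):
    """M4 side: post a request with the flag bit set."""
    state.requested_mode = mode | REQUEST_FLAG


def m4_request_pending(state):
    """M4 side: the M4 waits for as long as this returns True."""
    return bool(state.requested_mode & REQUEST_FLAG)


def m0_handle_request(state):
    """M0 side, called from the idle loop: apply a pending request, then ack."""
    if not state.requested_mode & REQUEST_FLAG:
        return False
    mode = state.requested_mode & ~REQUEST_FLAG
    # The M4 is blocked at this point, so shared state can be changed safely.
    state.m0_count = 0   # assumed: counts are reset as part of the transition
    state.m4_count = 0
    state.active_mode = mode
    state.requested_mode = mode  # clearing the flag bit signals completion
    return True
```

Because only the M0 writes active_mode and only the M0 clears the flag, an autonomous mode change by the M0 cannot clobber a request from the M4 in this model.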