Some optimizations for the hardware divider (#1033)
* Remove unnecessary wait in pico_divider.
  There is no need to wait if there are more than 8 cycles between setup and result readout. Dividend/divisor readout should be correct without delay. Update comment to reflect that.
* Optimize hw_divider_save_state/hw_divider_restore_state.
  Doing multiple pushes to avoid stack usage is faster. The wait loop in hw_divider_save_state had an incorrect branch. This didn't matter since the wait wasn't necessary to begin with.
* Remove pointless aligns in hardware_divider.
  The regular_func_with_section macro inserts a new section, so if aligning is desired it should be placed in the macro after the section start.
* Save a few bytes in hardware_divider.
  Signed and unsigned code can use the same exit code. Branching to the common code is free since we need the 8 cycle delay anyway.
@@ -19,11 +19,10 @@ need to change SHIFT above
 #endif
 
 // SIO_BASE ptr in r2; pushes r4-r7, lr to stack
-// requires that division started at least 2 cycles prior to the start of the macro
 .macro save_div_state_and_lr
-// originally we did this, however a) it uses r3, and b) the push takes 6 cycles, b)
-// any IRQ which uses the divider will necessarily put the data back, which will
-// immediately make it ready
+// originally we did this, however a) it uses r3, and b) the push and dividend/divisor
+// readout takes 8 cycles, c) any IRQ which uses the divider will necessarily put the
+// data back, which will immediately make it ready
 //
 // // ldr r3, [r2, #SIO_DIV_CSR_OFFSET]
 // // // wait for results as we can't save signed-ness of operation
@@ -31,7 +30,7 @@ need to change SHIFT above
 // // lsrs r3, #SIO_DIV_CSR_READY_SHIFT_FOR_CARRY
 // // bcc 1b
 
-// 6 cycles
+// 6 cycle push + 2 ldr ensures the 8 cycle delay before remainder and quotient are ready
 push {r4, r5, r6, r7, lr}
 // note we must read quotient last, and since it isn't the last reg, we'll not use ldmia!
 ldr r4, [r2, #SIO_DIV_UDIVIDEND_OFFSET]
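The cycle budget behind the new "6 cycle push + 2 ldr" comment can be checked with a small host-side model. This is only a sketch and not part of the commit: the function names are hypothetical, and it assumes that a Cortex-M0+ `push` of N registers takes 1 + N cycles, that each `ldr` from the SIO block takes a single cycle, and that the divider needs 8 cycles to produce its results.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed divider latency in cycles (taken from the commit message). */
enum { DIVIDER_LATENCY_CYCLES = 8 };

/* Assumption: a Cortex-M0+ push of reg_count registers costs 1 + reg_count cycles. */
int push_cycles(int reg_count) {
    return 1 + reg_count;
}

/* Assumption: a load from the single-cycle SIO block costs 1 cycle. */
int sio_ldr_cycles(void) {
    return 1;
}

/* Cycles spent in the macro before the remainder register is read:
 * push {r4, r5, r6, r7, lr}, then the dividend and divisor loads. */
int cycles_before_remainder_read(void) {
    int cycles = push_cycles(5);   /* r4-r7 and lr */
    cycles += sio_ldr_cycles();    /* ldr r4, dividend */
    cycles += sio_ldr_cycles();    /* ldr r5, divisor */
    return cycles;
}

/* True when the results are guaranteed ready without polling the CSR. */
bool results_ready_without_wait(void) {
    return cycles_before_remainder_read() >= DIVIDER_LATENCY_CYCLES;
}
```

Under these assumptions the macro has already spent 8 cycles by the time it reads the remainder, even if the division was started immediately before the macro, which is consistent with dropping the old "at least 2 cycles prior" requirement.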