Sidecar tofino sequencing is confusing if we've already timed out. #2324

@nathanaelhuffman

I troubleshot a system in manufacturing where, due to test-station SWD shenanigans, the SP was being interrupted in the critical window between enabling the Tofino power supplies and the step that adjusts the voltage rails and acknowledges via the VidAck mechanism.

You end up with ring buffers that look like this:

humility: ring buffer drv_sidecar_seq_server::__RINGBUF in sequencer:
 NDX LINE      GEN    COUNT PAYLOAD
  16   91        1        1 FrontIOControllerIdent { fpga_id: 0x1, ident: 0x1deaa55 }
  17   98        1        1 FrontIOControllerChecksum { fpga_id: 0x1, checksum: [ 0xd4, 0xaa, 0x2a, 0x16 ], expected: [ 0xd4, 0xaa, 0x2a, 0x16 ] }
  18  344        1        1 TofinoSequencerTick(LatchOffOnFault, A2 { error: None })
  19  154        1        1 FanModuleLedUpdate(Zero, On)
  20  154        1        1 FanModuleLedUpdate(One, On)
  21  154        1        1 FanModuleLedUpdate(Two, On)
  22  154        1        1 FanModuleLedUpdate(Three, On)
  23  344        1        3 TofinoSequencerTick(LatchOffOnFault, A2 { error: None })
  24  245        1        1 FrontIOBoardPowerGood
  25  328        1        1 FrontIOBoardPhyPowerEnable(true)
  26  550        1        1 FrontIOBoardPhyOscGood
  27  344        1        1 TofinoSequencerTick(LatchOffOnFault, A2 { error: None })
  28   81        1        1 TofinoPowerUp
  29   89        1        1 TofinoVidAttempt(0x0)
  30  262        1        1 TofinoNoVid
  31   89        1        1 TofinoVidAttempt(0x1)
   0  262        2        1 TofinoNoVid
   1   89        2        1 TofinoVidAttempt(0x2)
   2  262        2        1 TofinoNoVid
   3   89        2        1 TofinoVidAttempt(0x3)
   4  262        2        1 TofinoNoVid
   5   89        2        1 TofinoVidAttempt(0x4)
   6  262        2        1 TofinoNoVid
   7   89        2        1 TofinoVidAttempt(0x5)
   8  262        2        1 TofinoNoVid
   9   89        2        1 TofinoVidAttempt(0x6)
  10  262        2        1 TofinoNoVid
  11   89        2        1 TofinoVidAttempt(0x7)
  12  262        2        1 TofinoNoVid
  13  796        2        1 TofinoSequencerError(SequencerTimeout)
  14  286        2        1 TofinoSequencerAbort { state: InPowerUp, step: AwaitVidAck, error: VidAckTimeout }
  15  344        2       11 TofinoSequencerTick(LatchOffOnFault, A2 { error: VidAckTimeout })

This can be confusing, because the sequencer had already timed out when we entered iteration 0 of the VID loop. Better reporting for this case would have significantly sped up debugging here.

We ran all 8 loop iterations without first checking whether we had already timed out. That leaves you thinking the timeout happened during the loop, when in fact it occurred beforehand; we simply wait until all 8 iterations finish before giving up and going down the error path.
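
One possible shape for the improved reporting is sketched below. This is a minimal, host-side illustration and not the actual `drv_sidecar_seq_server` code: the function `await_vid_ack`, the `VidOutcome` type, and the trace entries `TofinoTimedOutBeforeVid`, `TofinoTimedOutDuringVid`, and `TofinoVidAck` are hypothetical names, and a `Vec` stands in for the ring buffer. Only `TofinoVidAttempt` and `TofinoNoVid` mirror entries from the dump above. The idea is to check for the timeout before iteration 0 and record a distinct entry, so the trace says whether the timeout preceded the loop or happened during it.

```rust
/// Trace entries for this sketch. `TofinoVidAttempt` and `TofinoNoVid`
/// mirror the dump above; the other variants are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Trace {
    TofinoVidAttempt(u8),
    TofinoNoVid,
    TofinoVidAck,                            // hypothetical
    TofinoTimedOutBeforeVid,                 // hypothetical
    TofinoTimedOutDuringVid { attempt: u8 }, // hypothetical
}

/// Hypothetical result type distinguishing the two timeout cases.
#[derive(Debug, PartialEq, Eq)]
enum VidOutcome {
    Acked,
    TimedOutBeforeLoop,
    TimedOutDuringLoop { attempt: u8 },
    NoVid,
}

/// Polls for a VID up to 8 times, checking the sequencer timeout
/// before the first attempt and again after each failed attempt.
/// `timed_out` and `vid_available` stand in for hardware reads.
fn await_vid_ack(
    mut timed_out: impl FnMut() -> bool,
    mut vid_available: impl FnMut() -> bool,
    trace: &mut Vec<Trace>,
) -> VidOutcome {
    const MAX_ATTEMPTS: u8 = 8;

    // Already timed out before iteration 0: record that explicitly
    // instead of running all attempts and reporting afterwards.
    if timed_out() {
        trace.push(Trace::TofinoTimedOutBeforeVid);
        return VidOutcome::TimedOutBeforeLoop;
    }

    for attempt in 0..MAX_ATTEMPTS {
        trace.push(Trace::TofinoVidAttempt(attempt));
        if vid_available() {
            trace.push(Trace::TofinoVidAck);
            return VidOutcome::Acked;
        }
        trace.push(Trace::TofinoNoVid);

        // A timeout observed here really did occur during the loop.
        if timed_out() {
            trace.push(Trace::TofinoTimedOutDuringVid { attempt });
            return VidOutcome::TimedOutDuringLoop { attempt };
        }
    }
    VidOutcome::NoVid
}
```

With a check like this, the failure described above would show a single "timed out before VID" entry rather than eight `TofinoVidAttempt`/`TofinoNoVid` pairs followed by the timeout.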
