On 7/2/25 10:47 AM, Arnaud POULIQUEN wrote:
On 7/2/25 17:23, Tanmay Shah wrote:
On 7/2/25 2:18 AM, Arnaud POULIQUEN wrote:
On 7/1/25 23:19, Tanmay Shah wrote:
On 7/1/25 1:06 PM, Tanmay Shah wrote:
On 7/1/25 12:56 PM, Tanmay Shah wrote:
On 7/1/25 12:18 PM, Arnaud POULIQUEN wrote:
>
>
> On 7/1/25 17:16, Tanmay Shah wrote:
>>
>>
>> On 7/1/25 3:07 AM, Arnaud POULIQUEN wrote:
>>> Hi Tanmay,
>>>
>>> On 6/27/25 23:29, Tanmay Shah wrote:
>>>> Hello all,
>>>>
>>>> I am implementing remoteproc recovery on attach-detach use case.
>>>> I have implemented the feature in the platform driver, and it works for
>>>> boot recovery.
>>>
>>> Few questions to better understand your use case.
>>>
>>> 1) The linux remoteproc firmware attach to a remote processor, and you
>>> generate a crash of the remote processor, right?
>>>
>>
>> Yes correct.
>>
>>> 1) How does the remote processor reboot? On a remoteproc request or it is an
>>> autoreboot independent from the Linux core?
>>>
>>
>> It is auto-reboot independent from the linux core.
>>
>>> 2) In case of auto reboot, when does the remote processor send an event to
>>> the Linux remoteproc driver? Before or after the reset?
>>>
>>
>> Right now, when Remote reboots, it sends crash event to remoteproc driver
>> after reboot.
>>
>>> 3) Do you expect to get core dump on crash?
>>>
>>
>> No coredump expected as of now, but only recovery. Eventually will implement
>> coredump functionality as well.
>>
>>>>
>>>> However, I am stuck at the testing phase.
>>>>
>>>> When should firmware report the crash ? After reboot ? or during some
>>>> kind of crash handler ?
>>>>
>>>> So far, I am reporting crash after rebooting remote processor, but it
>>>> doesn't seem to work i.e. I don't see rpmsg devices created after recovery.
>>>>
>>>> What should be the correct process to test this feature ? How other
>>>> platforms are testing this?
>>>
>>> I have never tested it on ST board. As a first analysis, in case of
>>> autoreboot of the remote processor, it looks like you should detach and
>>> reattach to recover.
>>
>> That is what's done from the remoteproc framework.
>>
>>> - On detach the rpmsg devices should be unbound
>>> - On attach the remote processor should request RPmsg channels using the NS
>>> announcement mechanism
>>>
>>
>> Main issue is, Remote firmware needs to wait till all above happens. Then
>> only initialize virtio devices. Currently we don't have any way to notify
>> recovery progress from linux to remote fw in the remoteproc framework. So I
>> might have to introduce some platform specific mechanism in remote firmware
>> to wait for recovery to complete successfully.
>
> I guess the rproc->clean_table contains a copy of the resource table that is
> reapplied on attach, and the virtio devices should be re-probed, right?
>
> During the virtio device probe, the vdev status in the resource table is
> updated to 7 when virtio is ready to communicate. Virtio should then call
> rproc_virtio_notify() to inform the remote processor of the status update.
> At this stage, your remoteproc driver should be able to send a mailbox
> message to inform the remote side about the recovery completion.
>
I think I spot the problem now.
Linux side:
file: remoteproc_core.c
rproc_attach_recovery
    __rproc_detach
        cleans up the resource table and re-loads it
    __rproc_attach
        stops and re-starts subdevices

Remote side:
Remote re-boots after crash
Detects crash happened previously
notify crash to Linux
(Linux is executing above flow meanwhile)
starts creating virtio devices
**rproc_virtio_create_vdev - parse vring & create vdev device**
**rproc_virtio_wait_remote_ready - wait for remote ready** [1]
I think Remote should wait on the DRIVER_OK bit before creating virtio devices. The temporary solution I implemented was to make sure the vring addresses are not 0xffffffff, like the following:
    while (rsc->rpmsg_vring0.da == FW_RSC_U32_ADDR_ANY ||
           rsc->rpmsg_vring1.da == FW_RSC_U32_ADDR_ANY) {
        usleep(100);
        metal_cache_invalidate(rsc, rproc->rsc_len);
    }
The above works, but I think a better solution is to change the sequence so that the remote waits before creating virtio devices.
I am sorry, I should have said, remote should wait before parsing and assigning vrings to virtio device.
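For reference, the polling workaround above can be given an upper bound so the firmware does not spin forever if Linux never completes the attach. This is only a minimal sketch under stated assumptions: `struct vring_view`, `VRING_DA_UNASSIGNED`, `refresh_fn` and `wait_for_vrings()` are hypothetical names invented for illustration and are not OpenAMP APIs. `VRING_DA_UNASSIGNED` stands in for FW_RSC_U32_ADDR_ANY, and the refresh hook stands in for the usleep() plus metal_cache_invalidate() pair.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for FW_RSC_U32_ADDR_ANY: the address has not been assigned yet. */
#define VRING_DA_UNASSIGNED 0xFFFFFFFFu

/* Hypothetical, reduced view of the two vring entries in the resource table. */
struct vring_view {
    volatile uint32_t vring0_da;
    volatile uint32_t vring1_da;
};

/*
 * Hypothetical platform hook: sleep briefly and invalidate the cache lines
 * covering the resource table so the next read observes what Linux wrote.
 */
typedef void (*refresh_fn)(void *ctx);

/*
 * Poll until both vring addresses are assigned, refreshing at most
 * max_polls times. Returns true once both addresses are valid and
 * false if they are still unassigned after the last poll.
 */
static bool wait_for_vrings(const struct vring_view *rsc,
                            unsigned int max_polls,
                            refresh_fn refresh, void *ctx)
{
    for (unsigned int i = 0; i <= max_polls; i++) {
        if (rsc->vring0_da != VRING_DA_UNASSIGNED &&
            rsc->vring1_da != VRING_DA_UNASSIGNED)
            return true;
        if (i < max_polls && refresh != NULL)
            refresh(ctx);
    }
    return false;
}
```

On timeout the firmware can report an error or retry, instead of going on to parse vrings that still hold the placeholder address.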
[1] https://github.com/OpenAMP/open-amp/blob/391671ba24840833d882c1a75c5d7307703b1cf1/lib/remoteproc/remoteproc.c#L994
Actually, upon further checking, I think the above code is okay. I see that wait_remote_ready is called before the vrings are set up on the remote fw side.
However, during recovery on the remote side, I still have to implement a platform-specific wait for the vrings to be set up correctly.
From the linux side, the DRIVER_OK bit is set before the vrings are set up correctly. Because of that, the remote firmware sets up wrong vring addresses, and then the rpmsg channels are not created.
I am investigating this further.
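To make the condition discussed here concrete, the sketch below decodes the vdev status byte and combines it with the vring address check. The status bit values come from the virtio specification (ACKNOWLEDGE = 1, DRIVER = 2, DRIVER_OK = 4, so the "7" mentioned earlier in the thread is all three set). Everything else is an assumption made for illustration: the macro names, `vdev_driver_ok()` and `remote_may_setup_vrings()` are hypothetical helpers, not OpenAMP or Linux functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Device status bits as defined by the virtio specification. */
#define VDEV_STATUS_ACKNOWLEDGE 0x01u
#define VDEV_STATUS_DRIVER      0x02u
#define VDEV_STATUS_DRIVER_OK   0x04u

/* Stand-in for FW_RSC_U32_ADDR_ANY: the address has not been assigned yet. */
#define VRING_DA_UNASSIGNED 0xFFFFFFFFu

/* True when the driver side (Linux here) has set DRIVER_OK. */
static bool vdev_driver_ok(uint8_t status)
{
    return (status & VDEV_STATUS_DRIVER_OK) != 0;
}

/*
 * Hypothetical readiness predicate for the remote firmware: only parse
 * the vrings when DRIVER_OK is set and both vring addresses have been
 * filled in.
 */
static bool remote_may_setup_vrings(uint8_t status,
                                    uint32_t vring0_da,
                                    uint32_t vring1_da)
{
    return vdev_driver_ok(status) &&
           vring0_da != VRING_DA_UNASSIGNED &&
           vring1_da != VRING_DA_UNASSIGNED;
}
```

With this predicate, the case described above (status already 7, vrings still 0xFFFFFFFF) is treated as not ready.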
Do you reset the vdev status as requested by the virtio spec? https://docs.oasis-open.org/virtio/virtio/v1.3/csd01/virtio-v1.3-csd01.html#...
Regards, Arnaud
Yes I do. I am actually restoring the default resource table on the firmware side, which will set the rpmsg_vdev status to 0.
However, when printing vrings right before wait_remote_ready, I see vrings are not set correctly from linux side:
`vring0 = 0xFFFFFFFF, vring1 = 0xFFFFFFFF`
That makes sense if the values correspond to the initial values of the resource table. rproc->clean_table should contain a copy of these initial values.
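A small sketch of why the firmware reads 0xFFFFFFFF at this point: restoring a pristine copy of the table resets the vdev status to 0 and puts both vring addresses back to the "not assigned" placeholder, so they stay that way until the Linux side writes real addresses again. The struct layout and the names `rsc_view`, `default_rsc` and `restore_default_rsc()` are hypothetical and greatly simplified. A real resource table has many more fields.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for FW_RSC_U32_ADDR_ANY: the address has not been assigned yet. */
#define VRING_DA_UNASSIGNED 0xFFFFFFFFu

/* Hypothetical, heavily reduced view of the rpmsg vdev resource entry. */
struct rsc_view {
    uint8_t  vdev_status;
    uint32_t vring0_da;
    uint32_t vring1_da;
};

/* Pristine values, as they would be built into the firmware image. */
static const struct rsc_view default_rsc = {
    .vdev_status = 0,
    .vring0_da   = VRING_DA_UNASSIGNED,
    .vring1_da   = VRING_DA_UNASSIGNED,
};

/* Overwrite the live table with the pristine copy after a reboot. */
static void restore_default_rsc(struct rsc_view *live)
{
    memcpy(live, &default_rsc, sizeof(*live));
}
```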
However, the rproc state was still moved to attach when checked from remoteproc sysfs.
Is rproc_handle_resources() called before going back to the attached state?
You are right. I think __rproc_attach() isn't calling rproc_handle_resources().
But recovery is supported by other platforms so I think recovery should work without calling rproc_handle_resources().
Maybe restoring the resource table from the firmware side after reboot isn't a good idea. I will try without it.
`cat /sys/class/remoteproc/remoteproc0/state`
attached
Somehow the sync between remote fw and linux isn't right.
Thanks,
Tanmay

> Regards
> Arnaud
>
>
>>
>>> Regards,
>>> Arnaud
>>>
>>>>
>>>>
>>>> Thanks,
>>>> Tanmay
>>