mtd/w25q: ensure the correct behavior if erase sector fails #15812
[Experimental Bot, please feedback here] Yes, this PR appears to meet the NuttX requirements, although it could be improved with more specific information. Here's a breakdown of what's good and what could be better: Strengths:
Weaknesses:
Recommendation: Add the missing information and details described above to strengthen the PR. Clear, concise, and complete information makes it much easier for reviewers to understand and approve the changes.
@@ -976,10 +976,10 @@ static int w25qxxxjv_erase_sector(FAR struct w25qxxxjv_dev_s *priv,
     }

   status = w25qxxxjv_read_status(priv);
-  if ((status & STATUS_BUSY_MASK) != STATUS_READY)
+  while ((status & STATUS_BUSY_MASK) != STATUS_READY)
@michallenc it would be nice to have a timeout here. It could be just a counter; it would keep the system from blocking forever in case something bad happens.
@acassis Probably yes. The advantage is that we know the longest possible delay (the erase time of the entire flash), so I put a counter there with this time.
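For readers following the thread, the counter-bounded wait discussed above could look roughly like the sketch below. It is not the driver code from this PR: `fake_flash_s`, `fake_read_status`, and `wait_until_ready` are hypothetical names, the status bit values are assumptions, and the real driver would read the status register over QSPI and sleep between polls.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit layout, for illustration only. */

#define STATUS_BUSY_MASK 0x01
#define STATUS_READY     0x00
#define STATUS_BUSY      0x01

/* Hypothetical stand-in for the flash: reports busy for busy_polls reads. */

struct fake_flash_s
{
  unsigned int busy_polls;
};

static uint8_t fake_read_status(struct fake_flash_s *dev)
{
  if (dev->busy_polls > 0)
    {
      dev->busy_polls--;
      return STATUS_BUSY;
    }

  return STATUS_READY;
}

/* Poll until the device is ready, giving up with -EBUSY once max_polls
 * retries have been used. In a real driver max_polls would be derived
 * from the longest possible operation (the full chip erase time).
 */

static int wait_until_ready(struct fake_flash_s *dev, unsigned int max_polls)
{
  uint8_t status;

  assert(dev != NULL);

  status = fake_read_status(dev);
  while ((status & STATUS_BUSY_MASK) != STATUS_READY)
    {
      if (max_polls == 0)
        {
          return -EBUSY;
        }

      max_polls--;

      /* A real driver would sleep here before polling again. */

      status = fake_read_status(dev);
    }

  return 0;
}
```

With this shape the loop is skipped entirely when the flash is already idle, and a device that never leaves the busy state produces `-EBUSY` instead of a hang.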
55494dc to 4db9b6c
Function w25qxxxjv_erase wasn't correctly handling an error in the w25qxxxjv_erase_sector call and was returning success even on failure. Moreover, this change does not immediately return EBUSY but waits for the previous operation to finish. If the wait becomes significant (longer than the erase time of the entire flash), it then returns EBUSY. Signed-off-by: Michal Lenc <[email protected]>
4db9b6c to b9c1941
@acassis any more concerns?
@anchao did you see there was a discussion in progress before you merged?
Sorry, I approved the PR too quickly. Next time I'll mark it as "Change Requested" until all pending concerns are resolved.
IMHO, maintainers should spend more time reading and learning the code and software architecture instead of commenting, which would be more efficient. NuttX is already a very big project. If pressure is put on contributors blindly, then the management method of this project is problematic.
Okay, I will mark "changes requested" each time I have questions.
@anchao so after reading the code you saw that it contains not only the return fix but also new blocking behavior with an arbitrary delay and no polling for success, that it may still return an error on an incomplete operation, and you are okay with that? I know this is the simplest possible implementation; I just wanted to make sure we are aware of the risks. It's not that I don't understand the code change; the questions are asked in a simple way so that everyone who comes in understands the problem and the context ;-) Recent discussions on the mailing lists reveal that the problem we have is not blind pressure on contributors but quite the opposite: blind pushing and merging of insufficiently validated breaking changes. We are trying to fix that; maybe it will work, maybe not.
This fix includes the number of retries, indicating that there may be state transition issues under extreme conditions, which I think is acceptable.
Thanks for all your efforts. I'm just stating my opinion, and it doesn't need to be accepted by everyone. Perhaps you will give some consideration to my view in certain decision-making processes.
@cederom @anchao @lupyuen @acassis Hey guys, I am not quite sure what the problem is here. The discussion was resolved; Alan just forgot to mark it, but he approved the PR after the changes, which should be sufficient. Regarding the possible blocking: I am aware it may be an issue under extreme circumstances, but a write to the flash is currently ALWAYS blocking if you have to flush the buffer and erase a page. We wait for the QSPI transfer to finish, we wait for the erase to finish, and there is a QSPI locking mechanism, so another thread accessing the flash may wait on the lock. The BCH layer also has a lock. Returning Yes, we can discuss some refactoring of the BCH, FTL, and MTD layers to correctly handle non-blocking access, but I don't think this PR breaks anything. The code won't enter the while loop in most cases; it is a fix for what is probably bad timing on some flashes. I experienced it on W25Q512, but not on W25Q01. And even on W25Q512 it's just one or two retries.
Summary
Function w25qxxxjv_erase wasn't correctly handling an error in w25qxxxjv_erase_sector call and was returning success even on failure. Moreover, this change does not return EBUSY but waits for the previous operation to finish. Returning EBUSY doesn't make much sense here as we approximately know how long to wait based on flash's parameters.
Impact
This fixes wrong error handling and ensures the erase waits for the previous operation to finish.
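The error-handling half of the fix can be illustrated with a small sketch: a multi-sector erase that stops and reports the first per-sector failure instead of returning success. This is an assumption-laden illustration, not the driver source; `erase_range`, `erase_sector_fn`, `always_ok`, and `fail_on_sector_2` are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-sector erase callback: 0 on success, negated errno on
 * failure.
 */

typedef int (*erase_sector_fn)(size_t sector);

/* Erase nsectors sectors starting at start. The first failure is
 * propagated to the caller; *nerased reports how many sectors were
 * erased before that point.
 */

static int erase_range(erase_sector_fn erase_sector, size_t start,
                       size_t nsectors, size_t *nerased)
{
  size_t i;

  assert(erase_sector != NULL && nerased != NULL);

  *nerased = 0;
  for (i = 0; i < nsectors; i++)
    {
      int ret = erase_sector(start + i);
      if (ret < 0)
        {
          return ret; /* Do not mask the failure as success. */
        }

      (*nerased)++;
    }

  return 0;
}

/* Stand-in callbacks used only to exercise the sketch. */

static int always_ok(size_t sector)
{
  (void)sector;
  return 0;
}

static int fail_on_sector_2(size_t sector)
{
  return sector == 2 ? -EIO : 0;
}
```

Calling `erase_range(fail_on_sector_2, 0, 4, &n)` returns `-EIO` with `n` equal to 2, whereas the behavior described in the Summary would correspond to returning 0 in that case.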
Testing
Tested on SAMv7 custom board with W25Q512 and W25Q01 flashes (512 Mbits and 1 Gbit, the latter has two dies, so slightly different behavior).