Change in coreboot[master]: [POC]lint: Add alphabetical sorting check - coreboot-gerrit

21 Mar 2021

Attention is currently required from: Julius Werner.
Hello Julius Werner,
I'd like you to do a code review. Please visit
https://review.coreboot.org/c/coreboot/+/51673
to review the following change.
Change subject: [POC]lint: Add alphabetical sorting check
......................................................................
[POC]lint: Add alphabetical sorting check
Change-Id: Ia60743fadbd7e5c044726eb566bcf976098c4162
Signed-off-by: Nico Huber nico.h@gmx.de
---
A util/lint/alphabetical.awk
1 file changed, 203 insertions(+), 0 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/73/51673/1

diff --git a/util/lint/alphabetical.awk b/util/lint/alphabetical.awk
new file mode 100644
index 0000000..40df2ce
--- /dev/null
+++ b/util/lint/alphabetical.awk
@@ -0,0 +1,203 @@
+# FSM
+# ===
+#
+# Trust me, writing the code was a lot easier than documenting this ;)
+# Let's start with a problem statement:
+#
+#    Find added lines in a patch that break an _existing_ alphabetical order.
+#
+# Notation
+# --------
+# We keep track of the last seen *existing* (e), *existing or removed*
+# (e|r) and *added* (a) lines that match a list pattern. We have to
+# compare new matches to the previous ones. For instance:
+#
+#   * `exist. < e`  -- The currently matched existing and the last
+#                      seen existing lines violate the order.
+#
+#   * `added < a|e` -- The currently matched added line is out of
+#                      order either with the last seen added or
+#                      existing line.
+#
+# Termination
+# -----------
+# Every state will return to the `OK` state when the end of a list
+# is reached. If we return from the `BAD` state to the `OK` state,
+# that means existing alphabetical order is harmed.
+#
+# Detection of the list end is complicated: If we see an added line
+# that ends a list, we can't know yet if the original list was or-
+# dered. Hence, if an existing list is split by the patch, we can
+# only make a statement about added lines in the last hunk, when we
+# decided if the original list was ordered or not. This risks false
+# negatives. A proper solution would push all new lists with a `BAD`
+# final state on a stack so their result can be decided once the
+# existing list's order is decided.
+#
+# Transitions
+# -----------
+# If any of the existing or removed matches violates the order,
+# we go to the `IGNORE` state that we only leave at the end of
+# the list.
+#
+# If an added line is not ordered (`added < e`, `added < a` or
+# `exist. < a`), we go to the `BAD` state. The only way out from
+# there is the `IGNORE` state, if we later detect that the exis-
+# ting list actually wasn't ordered in the first place.
+#
+#                                ----
+#                      .--------| OK |<-.
+#                     /          ----    `.
+#                    /            |       : reset
+#                   /  exist.< a  |      ':
+#                  |   added < a|e|     ' :
+#                  |              v    '  :
+#     exist. < e|r |            ----- '   :
+#     remov. < e|r |           | BAD |    :
+#                  |            -----     :
+#                  |              |       :
+#                  |   exist.< e|r|       :
+#                  |   remov.< e|r|       :
+#                   \             v      '
+#                    \         -------- '
+#                     `------>| IGNORE |
+#                              --------
+#
+
+function debug(message) {
+	if (DEBUG != 1)
+		return;
+
+	printf("%d: %s\n", line, message);
+}
+
+function set_state(new_state) {
+	if (state != new_state)
+		debug(new_state);
+	state = new_state;
+}
+
+function reset_state(full) {
+	set_state("OK");
+	contender = "";
+	if (full)
+		last_er = "";
+	last_e = "";
+}
+
+function check_state() {
+	if (state != "BAD")
+		return;
+
+	if (last_er == "") # remove to check new lists as well
+		return;
+
+	if (!printed) {
+		printf("Please maintain alphabetical order.\n");
+		printed = 1;
+	}
+	printf("%s:%d: %s\n", file, contender_line, contender);
+}
+
+function violates(a, b) {
+	return b != "" && a < b;
+}
+
+function match_existing() {
+	if (state == "IGNORE")
+		return;
+
+	existing = substr($0, 2);
+	if (violates(existing, last_er))
+		set_state("IGNORE");
+	else if (violates(existing, contender))
+		set_state("BAD");
+
+	last_er = existing;
+	last_e = existing;
+}
+
+function match_removed() {
+	if (state == "IGNORE")
+		return;
+
+	removed = substr($0, 2);
+	if (violates(removed, last_er))
+		set_state("IGNORE");
+
+	last_er = removed;
+}
+
+function match_added() {
+	if (state != "OK")
+		return;
+
+	added = substr($0, 2);
+	if (violates(added, contender) || violates(added, last_e))
+		set_state("BAD");
+
+	contender_line = line;
+	contender = added;
+}
+
+BEGIN {
+	reset_state(1);
+}
+
+/^+++ / {
+	file = substr($0, 5);
+	debug(file);
+	include = file ~ /.(c|h|asl)$/;
+	kconfig = file ~ //Kconfig$/;
+}
+
+/^@@ / {
+	line = substr($0, match($0, /+[0-9]+/) + 1, RLENGTH) - 1;
+}
+
+/^[+ ]/ {
+	line = line + 1;
+}
+
+/^ / {
+	if (include && /^.[[:space:]]*#include </ ||
+	    kconfig && /^.[[:space:]]*select /) {
+		match_existing();
+		debug($0);
+		next;
+	}
+}
+
+/^-/ {
+	if (include && /^.[[:space:]]*#include </ ||
+	    kconfig && /^.[[:space:]]*select /) {
+		match_removed();
+		debug($0);
+		next;
+	}
+}
+
+/^+/ {
+	if (include && /^.[[:space:]]*#include </ ||
+	    kconfig && /^.[[:space:]]*select /) {
+		match_added();
+		debug($0);
+	} else {
+		# In the added lines, a new list starts, but we
+		# don't know yet if the original was ordered. We
+		# might create false negatives this way. Test
+		# results look good, though.
+		reset_state(0);
+	}
+	next;
+}
+
+/^[ -]/ {
+	check_state();
+	reset_state(1);
+	debug($0);
+}
+
+END {
+	check_state();
+}
-- 
To view, visit https://review.coreboot.org/c/coreboot/+/51673
To unsubscribe, or for help writing mail filters, visit https://review.coreboot.org/settings

Gerrit-Project: coreboot
Gerrit-Branch: master
Gerrit-Change-Id: Ia60743fadbd7e5c044726eb566bcf976098c4162
Gerrit-Change-Number: 51673
Gerrit-PatchSet: 1
Gerrit-Owner: Nico Huber nico.h@gmx.de
Gerrit-Reviewer: Julius Werner jwerner@chromium.org
Gerrit-Attention: Julius Werner jwerner@chromium.org
Gerrit-MessageType: newchange