Page MenuHomePhabricator

Initial Tree for Goldilocks Coordinator

Authored by jdtolmie on Feb 14 2022, 1:32 PM.


rG4ca23a2094ab: Initial Tree for Goldilocks Coordinator
Required Signatures
Restricted Legalpad Document
Test Plan

Nothing was done. Just created the files.

Diff Detail

rG Goldilocks
Lint Not Applicable

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Restricted Application completed remote builds in Restricted Buildable.Mar 1 2022, 4:04 PM
  • Made changes to coordinator.cpp and hpp to create child nodes in the class
Restricted Application completed remote builds in Restricted Buildable.Mar 7 2022, 3:49 PM

I suspect this isn't the correct description. What does this part of the code do?




The explicit keyword should be fine for now, but be aware that this may become an issue later if you try to pass a std::string object to the constructor. explicit means that any arguments passed in will not be cast to the expected type.


The node really doesn't need to know this. Within a single execution of the program, the tests in the suite will only be loaded once, and they'll be loaded the same way whether some of the tests will be run, or whether all of the tests will be run.

I can't really find good words to explain this.......aghhh.


I notice that you have one parent for the whole Coordinator instance, which I understand to be your effective "Tree" class. What does parent specifically mean in your usage: the root of the tree, or the parent of the particular node?


Lint told me to add this. I didn't until then.


I removed it in some previous work I was doing, but have not updated yet. Sorry.


This is where I am just totally lost. I am trying to get the initial class call to have a parent of nullptr, and all calls to the object afterward to be a level down from the previous calls to add_child (root= nullprt. second=root.third=second etc). It's been two weeks of spinning my wheels on this.

  • Total rewrite of coordinator.hpp and .cpp. Based off of tree Jason C. McDonald wrote in python.
Restricted Application completed remote builds in Restricted Buildable.Apr 4 2022, 3:34 PM

Jack, you are DEFINITELY on the right track! I sense that you're starting to get a bit of traction in this task, and I'm liking what I'm seeing.

I've left a fair bit of feedback inline. Don't be discouraged by it. I won't provide this level of feedback to someone I don't feel is capable of doing good something with it. If you want/need more information, I encourage you to reply directly to comments inline, or to discuss on ~town-square on Mattermost.

One general note besides inline comments. I believe you had disabled clang-format from running automatically on arc diff, so you'll need to run it manually to fix some formatting. One way is to run the following from the root of the repository.

shopt -s globstar
clang-format -i --style=file **/*.{hpp,cpp}

Definitely not a fan of the Hungarian notation here. I tried to give it a chance before I said something, but in general, things like scope and type should not be embedded in the name. The notation used to access this variable will be enough to speak to its membership, such as this->node_name or some_node.node_name.

The one exception is, if a member variable is unavoidably public, but should not be touched, you might start it with a single underscore...but that's more of a Python convention that I sometimes bring into C++.


Our house style is to put each statement on its own line, and we always use curly braces. This is chiefly helpful for readability, but it also is insurance against some accidental syntax errors when the code is changed later.

for (auto& node: m_children) {
    if (node->m_node_name == child_name) {
        return node;

As a rule in prod code, readability is king. Avoid one-liners like line 86 is now unless you have very clear, situation-specific reasons to do otherwise. (In practice, you may go an entire career without encountering such a situation.)


That's smart.


What is this variable for?


Just a suggestion, we might use the function name register here. While we are adding a new node to a tree, that's more of an internal implementation detail that a user of Coordinator shouldn't care about. To the user of Coordinator, they are "registering" a test or suite.

To that end, the parameter name on_node might make more end-user sense as just parent.

(To be clear, I DO recommend keeping Node::add_child named as it is! I'm only suggesting renaming Coordinator::add to Coordinator::register)


A second, more technical, bit of feedback here is to provide some sort of default value for on_node/parent, so it can be omitted if someone intends to register on the root. Perhaps the default value is an empty string, and then you check in the body of this function for an empty string on on_node; if it is, you add on root.


Hmm. The thought just occurred to me that the use of a template may send you on a wild goose chase on a technical detail. That's my fault. Consider if you have a Node<TestFlexArray_Push> and a Node<TestFlexStack_Pop> in your tree. What would this function need to return? Node<Runnable> would not be quite right, because Runnable is only the parent class, not the child class.

If you just said, "Wait, huh?", you're actually following me. It's confusing, both to us and to the compiler. I forgot that templates and inheritance do not always play well with each other.

The good news is, the way forward is a lot simpler. We don't need templates on the Node at all! Instead, you can drop the template from Node altogether, and change std::shared_ptr<T> m_runnable; to std::shared_ptr<Runnable> m_runnable;.

I told you I overengineer things!

I recommend you read as well. This will be helpful information to either learn or revisit as you move forward, since Runnable is just an abstract parent of Test and Suite, themselves abstract parents of every test and suite written to be used with Golidlocks.


This comment isn't needed. We have git blame to tell us who added things, and string is clearly used below.


If you want to make it less useless, you can move it to types.hpp, and give it a single abstract method run(), like this:

class Runnable
    itemname_t name;
    testdoc_t docstring;

    Runnable(itemname_t name, testdoc_t docstring)
    : name(name), docstring(docstring)

    bool run() = 0;

Then in suite.hpp, drop line 65 (the old definition of Runnable), and set TestSuite to inherit from Runnable. You'll also need to adjust its constructor to use the name and docstring variables (and constructor) of its parent, and have a bool run() function:

class TestSuite(Runnable)
	std::unordered_map<itemname_t, Runnable> runnables;
	std::unordered_map<itemname_t, Test*> compares;

	TestSuite(itemname_t suite_name, testdoc_t docstring)
	: Runnable(suite_name, docstring)

        virtual void run() {
            // TODO: We'll have to tackle this function later; just put the stub in place.

	virtual void load() = 0;
        // ...

Likewise, in test.hpp...

class Test(Runnable)
        // ...
	Test(testdoc_t test_name, testdoc_t docstring)
	: Runnable(test_name, docstring)

       // notice that I removed the old `test_name` and `doc_string` fields from this object

	virtual bool pre() { return true; }

       // ...

You may need to check the rest of the code for any references to the old Suite::suite_name, Suite::suite_desc, Test::test_name, and Test::doc_string. Instead, both Suite and Test will have members name and docstring via inheritance, which provides a more consistent interface.

The important thing is that any assumptions your Coordinator/Node code is going to make about Tests and Suites should be incorporated into Runnable. (If you need help understanding what I mean by this, let me know.)


You already defined this class in coordinator.hpp, so you should NOT redefine it here. Your #include on line 1 of this file has already brought it in. In general, only definitions of functions that you prototyped in the .hpp file should be here.


Can you elaborate why you'd skip if the string is "root"? Wouldn't the root node be implied at the start of the string? That is, "apple.gala" implies "".

Just to get a touch pedantic about it, how do you know someone won't actually want to use the literal string name "root" for a test? For example: ""?

In general, we try not to attach special meanings to arbitrary literal values like strings and numbers. It's impossible to imagine all the possible values a user could throw at a system within the natural constraints we set.


This comment is only restating implementation, which isn't useful. Instead, consider something like this...


This is okay, but I find an explicit check is a little more readable.


You may want to return here instead, or even better, raise an exception. Not finding the user-requested node is technically unrecoverable; the user must decide what to do next (e.g. change their search). Exceptions are good for "exceptional" situations like this, where this function cannot be expected to figure out how to recover ("unrecoverable error").

As the code stands, if the user searches for, and does not exist, but fruit.gala does, then it'll return fruit.gala. That's clearly not the intended outcome.

This revision now requires changes to proceed.Apr 6 2022, 4:13 PM
jdtolmie marked 12 inline comments as done.
  • Changed find_node and load to take vector<string>, not a string. Fixed issue with various segmentation faults, and now 'find' should work properly, as it was finding a node without looking up the tree structure
Restricted Application completed remote builds in Restricted Buildable.Apr 20 2022, 3:41 PM

Here is the full code I am working with, in order to test it:


That was my overlooking it. I was working on this as a single file on another system. I copied and pasted it into the .hpp and .cpp. I forgot to remove it from this file.

  • Fixed segmentation faults. Fixed issues with 'find_node' and 'load' not adding all of the elements of the vector passed to them.
Restricted Application completed remote builds in Restricted Buildable.Apr 29 2022, 5:56 PM

No blockers from what I can see! There are some notes for subsequent tasks, and some optional improvements you can make.


If you're building via make ready or make tester_debug (our build system), then this import should work as #include "#iosqueak/stringy.hpp"


Assuming children is std::vector<std::shared_ptr<Node>>, then you only need to pass the constructor arguments themselves to emplace_back.


More information


Nit (meaning, this is a suggestion that doesn't necessarily have to be followed through on): if you have a raw getter for node_name, would it make more sense to just make node_name be a public attribute?


For later (NOT now!), we should rewrite this function to instead return a string with the same data, such that the Goldilocks Shell has full control of when and how this is displayed, including formatting. Consider opening a feature ticket.


Is this the part of the code you mentioned would create a node for each part of the path?

Here's my suggestion from the meeting again. However, it's probably best if you make this a subsequent task and Differential, instead of implementing right now.

  1. For each node name, in the parent, find the node on its parent in the path. e.g. given apple.gala, search for apple on root, gala on apple, etc.
  1. If the node exists on the parent AND is a leaf (no children), call load() on the node. This should load all the runnables in the node's associated callable in as children. That is, if you call load() on the apple node, and it had an associated apple suite, and that suite had within it the runnables gala, fuji, and jonigold, you'd create nodes for each of those callables, and store those nodes as children of the apple node. Then, proceed to step 3.
  1. If the node exists on the parent AND is NOT a leaf, return to step 1 for the next node.
  1. If the node exists, and there are no nodes after it in the path, call run() the node's callable.

There are a couple exceptional situations that will also need to be handled:

  • If a node in the path is a test (not a suite) AND is NOT the last node in the path, there's a problem with the path. (Tests don't contain other runnables.)
  • If a node in the path does not exist after load() is called on the parent, it should fail explicitly (e.g. an exception or an appropriate error-indicating return value.)
This revision is now accepted and ready to land.Apr 30 2022, 6:10 PM
jcmcdonald retitled this revision from Created .cpp and .hpp for coordinator project to Initial Tree for Goldilocks Coordinator.Apr 30 2022, 6:10 PM
jdtolmie marked 2 inline comments as done.
  • Made change to add_child to put a new Node, not a shared_ptr<Node>
This revision now requires review to proceed.May 10 2022, 3:00 PM
Restricted Application completed remote builds in Restricted Buildable.May 10 2022, 3:00 PM
This revision is now accepted and ready to land.May 11 2022, 7:22 PM

Great work! I've just added a couple of comments relating to the lint errors, which we want to address.

We actually have a checklist for landing Differentials. For (4), Harbormaster/Jenkins is unavailable. You can also ignore (11) and (13) for the moment; you will need to write docs later.

1Before landing, each Revision should...
3(1) Have an associated Maniphest Task.
4(2) Accomplish the feature(s) it was designed to accomplish. [In some cases, the feature itself may be dropped, and only bugfixes and/or optimizations landed instead.]
5(3) Have merged/rebased all changes from `devel` into itself, and all conflicts resolved. ($ git pull --rebase origin devel)
6(4) Have binaries and unnecessary cruft untracked and removed. (Keep an eye on .gitignore!)
7(5) Compile and run properly - this should be confirmed via Harbormaster/Jenkins (if available).
8(6) Be free of compiler errors and warnings (must compile with `-Wall -Wextra -Werror`).
9(7) Be Valgrind pure (no memory leaks detected).
10(8) Comply with Coding and Technical standards.
11(9) Be free of linter errors. ($ arc lint --lintall)
12(10) Be fully CSI commented.
13(11) Have an up-to-date build script (generally CMake) if relevant.
14(12) Contain relevant LIT tests, if the project is Goldilocks capable.
15(13) Have a Test Plan, generally containing a list of Goldilocks tests the reviewer should run.
16(14) Be reviewed, built, tested, and approved by at least one trusted reviewer (Staff or Trusted Contributor).
17(15) Have up-to-date Sphinx documentation, which compiles with no warnings.
18(16) Have all reviewer comments processed and marked "Done".


Usually we want to address lint errors. However, this particular lint error is not valid in this case; we want to allow casting a string literal or std::string to a std::string_view, so we do not want the constructor to be explicit (no implicit casting allowed).

Instead, you can suppress this lint error by adding a `// cppcheck-suppress comment above the line the lint error is on.


Same thing here.


This lint suggestion may be worth looking into, as it would use only two lines of code, instead of three.

Here's one possible way to do this, with a lambda (anonymous function) for the third argument to find_if.

Bear in mind, this will change the logic slightly, as std::find_if returns an iterator to the node, which has to be dereferenced with *node in my code. However, if it could not find the result, it will return children.last() instead. I test for that in a ternary statement here.

NOTE: The code is untested, although based on the documentation. You will need to test it out yourself and ensure it works as expected.

Mind you, there's also nothing wrong with your approach. If you would rather stick with your current technique, add the linter suppression comment above your if statement:

// cppcheck-suppress useStlAlgorithm
  • Made some changes I thought I already did. Fixed a few things Jason wanted done
This revision now requires review to proceed.Jun 3 2022, 4:40 PM
Restricted Application completed remote builds in Restricted Buildable.Jun 3 2022, 4:40 PM

Please resolve the rest of the linter errors, both from last round and this.

Keep going, you're almost there!


Maybe I'm just tired, but I can't find the outer variable named "child". Can you, or is the linter just meshugganah?


I agree with linter.

This revision now requires changes to proceed.Jun 3 2022, 7:16 PM
jdtolmie marked 4 inline comments as done.
  • Made some changes I thought I already did. Fixed a few things Jason wanted done
Restricted Application completed remote builds in Restricted Buildable.Jun 3 2022, 8:15 PM
  • Made some changes I thought I already did. Fixed a few things Jason wanted done
Restricted Application completed remote builds in Restricted Buildable.Jun 3 2022, 8:16 PM

I made the changes that lint wanted me to, and corrected a few things that I thought I had working. I will create a subtask, but I realised that if you 'add test', and then 'add test' again, it will add 'test' twice to the root. I was going to fix it, but thought if I am giving this to Jaime, that he could work on it. Let me know if you want me to do it.


I did not fill this in, because to be honest, I forgot exactly what I was supposed to be doing I was so lost working on the tree. :)


I changed it to what you liked. I was working with that, but I just found this way easier to understand, and I was struggling trying to make everything with less lines of code. I never thought of just returning it the way you wrote out, but it does save some lines over the if/else statements I had.


Sure. I will change it. I just remember in my studies saying to keep things public to a minimum. Saves people messing around with things.


It was there. It was in the for loop that printed the child node names. I renamed it.

This revision is now accepted and ready to land.Jun 4 2022, 8:53 AM