Page MenuHomePhabricator

Fix PawSort (intro_sort and heap_sort)
ClosedPublic

Authored by lulu731 on Aug 13 2018, 8:34 AM.

Details

Reviewers
jcmcdonald
Commits
rPc99e7e0f1722: Fix PawSort (intro_sort and heap_sort)
Required Signatures
Restricted Legalpad Document
Summary

Implemented introspective sort :

  • DPQS;
  • pivots selected thanks to median-of-3 algo;
  • quitting recursive sort through depth (apply heap-sort) or threshold (number of item in array < 17).

Based on http://www.cs.rpi.edu/~musser/gp/introsort.ps

Test Plan

Added tests for insertion_sort.

Diff Detail

Repository
rP PawLIB
Lint
Lint Not Applicable

Event Timeline

Restricted Application completed remote builds in Restricted Buildable.Aug 13 2018, 8:36 AM

Build is green

Building PawLIB on LittleXenial as build 466.

https://jenkins.mousepawmedia.net:8449/job/pawlib_build/466/ for more details.

Thanks @lulu731! However, before I can review, we need your signature on {L5} for legal reasons. If you have any questions about our Terms of Development, don't hesitate to ask. :)

Some super minor things. You can probably address these at the same time as your additional improvements; just keep working on the same branch, and run the usual...

git add .
git commit
arc diff

...to update this Differential.

pawlib-source/include/pawlib/pawsort.hpp
162–168

The sparse commenting was my bad. Would you be so kind as to intent-comment this following our Commenting Showing Intent standard? (https://standards.mousepawmedia.com/csi.html)

In short, each logical statement (or at least logical section) should have a comment describing the "intent" or "why". If, in later reading or code review, the code and intent are found to mismatch, one can generally assume there's a bug.

170–173

Much better name than tmp. Thank you. :)

183

Actually, the prefix notation was deliberate. When its standalone like this, it's a minor performance gain in some circumstances, moot in others. Compiler optimization aside, --j; is one less instruction than j--;.

302–303

What's this do? (By the way, you can reply to inline comments right here on the Differential. Just tap the reply button to the upper right of this box.

This revision now requires changes to proceed.Aug 14 2018, 9:19 AM
This comment was removed by jcmcdonald.
jcmcdonald retitled this revision from Fixed insertion_sort used in intro_sort. modified : pawlib-source/include/pawlib/pawsort.hpp modified : pawlib-source/include/pawlib/pawsort_tests.hpp to Fix Insertion Sort.Aug 15 2018, 8:42 AM
jcmcdonald edited the summary of this revision. (Show Details)
jcmcdonald updated the revert plan for this revision. (Show Details)
lulu731 marked 3 inline comments as done.

introsort implemented with DPQS, median-of-3 pivots and heap_sort.

Updating D240: Fix Insertion Sort

  • implementation of introsort;
  • tests for insertion sort added.
Restricted Application failed remote builds in Restricted Buildable!Sep 7 2018, 11:17 AM

Build has FAILED

Building PawLIB on LittleXenial as build 472.

Link to build: https://jenkins.mousepawmedia.net:8449/job/pawlib_build/472/
See console output for more information: https://jenkins.mousepawmedia.net:8449/job/pawlib_build/472/console

@lulu731 Oh! I think I know what's going on.

In D241: PawLIB: Use CPGF 1.6.0/rPf93584d74597: PawLIB: Use CPGF 1.6.0, I upgraded PawLIB to use CPGF 1.6.0, and I updated the dependencies on Jenkins to provide CPGF 1.6.0. However, because you last pulled from master before D241 landed, you've got old code.

In other words, the fix is simple: switch to and update your copy of master, and then rebase your introsort branch off the new master. Before you do this, make sure you don't have any uncommitted changes; if you have uncommitted changes, git stash them first, and then git stash pop them after the rebase.

git checkout master
git pull origin master
git checkout introsort
git rebase master
jcmcdonald retitled this revision from Fix Insertion Sort to Fix PawSort.Sep 7 2018, 11:37 AM

In addition to my above comment, could you please update the description on this Differential Revision (Edit Revision up above) to reflect the whole gamut of changes you're making? Thanks!

The lint warnings may not be strictly necessary, but they're really a good idea to address. We really do need to get into the habit of using override around here, and I need to introduce it into the code elsewhere. Our standard is to only land code once the lint is clean. (In the rare case the linter is in actual error, we manually suppress the warning on that particular line; that's not necessary to do here.)

This revision now requires changes to proceed.Sep 8 2018, 6:21 PM
lulu731 retitled this revision from Fix PawSort to Fix PawSort (intro_sort and heap_sort).Sep 9 2018, 5:11 AM
lulu731 edited the summary of this revision. (Show Details)
lulu731 edited the test plan for this revision. (Show Details)
  • Added tests for insertion_sort.
  • First implementation of intro_sort. Tests pass.
  • introsort algorithm implemented with median of 3 and 2 pivots.
  • Implementation of intro_sort (DPQS + median-of-3 + heap_sort)
  • Override specifier for function in TestInsertionSort class.

Build is green

Building PawLIB on LittleXenial as build 473.

https://jenkins.mousepawmedia.net:8449/job/pawlib_build/473/ for more details.

Restricted Application completed remote builds in Restricted Buildable.Sep 9 2018, 5:29 AM
jcmcdonald added inline comments.
.gitignore
24

Why are we untracking .gitignore? This needs to be tracked!

This revision now requires changes to proceed.Sep 9 2018, 5:31 AM

Great progress! All the tests are passing. (See below in this comment for the benchmark results).

A few things need to be sorted out about the tests. See inline comments.

Also, (as mentioned in the previous comment, please don't add .gitignore to .gitignore. That file should be tracked.

Third, it seems the PawSort test suite is being registered under the wrong name (P-sB40 instead of P-sB30). It appears that the OneString suite was mistakenly given this suite's assigned ID...in other words, they're swapped. Could you untangle that in pawlib-tester/main.cpp:87-88?

In other news, here's the benchmark results for PawSort right now! You can run each of these via ./tester --benchmark P-tB3001 (or whatever the ID is):

TestResult (Baby Bear, Adjusted)
P-tB3001: Sorted ArrayWon (+220495 cycles)
P-tB3002: Reversed ArrayLost (-118451 cycles)
P-tB3003: Nearly, Invert 2Lost (-66325 cycles)
P-tB3004: Nearly, Invert 5Lost (-155802 cycles)
P-tB3005: Few UniqueLost (-735967 cycles)
P-tB3006: Black SheepLost (-72129 cycles)
P-tB3007: Double ClimbWon (+63023 cycles)
P-tB3008: Double DropWon (+564468 cycles)
P-tB3009: StairsWon (+285690 cycles)
P-tB3010: MountainWon (+501040 cycles)
P-tB3011: Double MountainLost (-197263 cycles)
P-tB3012: EverestWon (+381001 cycles)
P-tB3013: CliffWon (+648601 cycles)
P-tB3014: SpikeWon (+454654 cycles)
P-tB3015: CliffWon (+699406 cycles)
P-tB3016: NightmareLost (-1056965 cycles)

After landing this, we'll want to look into why the other tests fail. This algorithm seems especially vulnerable to scenarios with few unique items, or where the array is *mostly* sorted. Nightmare is also a problem: I wrote it specifically to cause problems for std::sort.

NOTE: As you might know, you never run benchmarks on Debug, as -g can throw things off (especially in our favor).
pawlib-source/src/pawsort_tests.cpp
42–43

Rename to P-tB3010

42–43

Change this line to TestPawSort(...), true

43–45

...and change this line to TestStdSort(...).

Somehow I inverted them.

63–65

I just realized...why do we have this scenario twice? Was it supposed to be DOUBLE_CLIFF or something like that?

71–133

Insertion Sort is always going to be slower than any decent implementation of Introsort (std::sort is one). Thus, the comparative tests are pointless for InsertionSort.

Rewrite each of these entries to only have the first two arguments (the main test, and true for "run automatically with the suite").

In other words...

register_test("P-tB3021",
    new TestInsertionSort(TestSort::ARRAY_SORTED), true);
107

Drop the * from this name. That's part of an old test ID system we no longer use.

127–129

Same as before, is this supposed to be DOUBLE_CLIFF or something? (My mistake, I suspect).

lulu731 marked 13 inline comments as done.
  • Fix minor errors on .gitignore, pawlib-source/src/pawsort_tests.cpp, pawlib-tester/main.cpp
  • Fix PAWSORT tests scenarios.

Bravo!

Landing it to master is simple: arc land introsort.

Then, you can make a new branch and start on whatever's next.

This revision is now accepted and ready to land.Sep 12 2018, 12:50 PM
This revision was automatically updated to reflect the committed changes.
Restricted Application completed remote builds in Restricted Buildable.Sep 14 2018, 10:40 AM

Build is green

Building PawLIB on LittleXenial as build 480.

https://jenkins.mousepawmedia.net:8449/job/pawlib_build/480/ for more details.