Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Macro for nondecimal bases #768

Merged
merged 19 commits into from
Feb 14, 2024

Conversation

pstaabp
Copy link
Member

@pstaabp pstaabp commented Jan 23, 2023

This provides a macro from converting decimals to and from non-decimal bases. Although not necessary, the maximum base is 16 (hexadecimal). Also, a colleague of mine uses base-12 with T and E as the digits ten and eleven in that base, so there is an optional array of digits to use.

A test file is provided and to run it enter (from PG_ROOT): prove -v t/macros_math/nondecimal_base.t

Also, a sample PG is provided:
nondecimal.pg.txt

There are plenty of examples in the test file and the sample PG file.

@dpvc
Copy link
Member

dpvc commented Jan 24, 2023

Although the code works, it is not as efficient as it could be. In particular, the conversion to base 10 can be improved in several ways. Your current loop uses two multiplications, one addition, and two assignments for each digit processed, but it can be done with one multiplication, one addition, and one assignment (see below). But the real inefficiency is in using grep to find the base-10 value for a given digit. Note that this will compare the given digit to every digit, even after it finds the correct one, and so that is a lot of looping that gets done for each digit in the original value. One option would be to join the digits into a single string and use index() to find the position (and so the base-10 value) of a given digit, but there is a better way, which is to use a lookup hash that maps the digits back to their base-10 values. I give an example in the code below.

To eliminate one multiplication and one assignment from the loop (and also the building of a new list from the reverse() call), you can evaluate the original number as a polynomial using Horner's method. Thinking of dd₂...dₙ as an n-digit base b number, we can write the base-10 equivalent as dbⁿ⁻¹ + dbⁿ⁻² + ... + dₙ₋₁b + dₙ, which can be rewritten (by factoring out successive bs) as (...((db + d₂)b + d₃)b + ... + dₙ₋₁)b + dₙ.

That means you can compute the value by taking the first digit, multiplying by b and adding the next digit, then multiply that sum by b and add the next digit, and so on. That is one multiply, one addition, and one assignment per digit (rather than two multiplies and assignments), and doesn't require you to reverse the digit list.

The following example code implements this approach, using the digit lookup table rather than the grep loop.

sub convertBase {Base::convert(@_)}

package Base;

#
#  The standard digits, pre-built so it doesn't have to be done each time the conversion is called
#
our $digits16 = [ '0' .. '9', 'A' .. 'F' ];
our $digit16 = { map { ($digits16->[$_], $_) } (0 .. scalar(@$digits16) - 1) };

sub convert {
  my $value = shift;

  #
  # Get the conversion options
  #
  my %options = (
    from => 10,
    to => 10,
    digits => $digits16,
    @_
  );
  my $from = $options{'from'};
  my $to = $options{'to'};
  my $digits = $options{'digits'};

  die "The digits option must be an array of characters to use for the digits"
    unless ref($digits) eq 'ARRAY';

  #
  # The highest base the digits will support
  #
  my $maxBase = scalar(@$digits);

  die "The base of conversion must be between 2 and $maxBase"
    unless $to >= 2 && $to <= $maxBase && $from >= 2 && $from <= $maxBase;

  #
  # Reverse map the digits to base 10 values
  #
  my $digit = ( $digits == $digits16 ? $digit16 :
                { map { ($digits->[$_], $_) } (0 .. $maxBase - 1) } );

  #
  #  Convert to base 10
  #
  my $base10;
  if ($from == 10) {
    die "The number to convert must consist only of digits: 0,1,2,3,4,5,6,7,8,9" 
      unless $value =~ m/^\d+$/;
    $base10 = $value;
  } else {
    foreach my $d (split(//, $value)) {
      die "The number to convert must consist only of digits: " . join(',', @$digits[0 .. $from - 1])
        unless defined($digit->{$d});
      $base10 = $base10 * $from + $digit->{$d};
    }
  }
  return $base10 if $to == 10;

  #
  #  Convert to desired base
  #
  my @base = ();
  do {
    my $d = $base10 % $to;
    $base10 = ($base10 - $d) / $to;
    unshift(@base, $digits->[$d]);
  } while $base10;

  return join('', @base);
}

This uses a package namespace to make the $digits16 and $digit16 values not be global, but also only have to be computed once (there are other ways to do that, but this makes it possible to modify these lists from within a problem, if needed).

It also handles the options hash by using a default hash rather than the // operator, since this makes the allowed options and their defaults more apparent. The digits option is allowed to be longer than 16, so this gives the option of using bases bigger than 16 without modifying the code (the $digits16 value could also be larger to make that the default).

The check for valid digits is done in the conversion loop rather than using a dynamically created regex, which is more expensive (also, you haven't quoted regex special characters, so someone could potentially create an invalid or unexpected regex by using things like * or ? as digits).

Finally, the conversion from base 10 to another base uses an array that is joined at the end rather than creating multiple strings, though both are about the same efficiency.

In my timing tests, the conversion to base 10 is about 4 times faster using Horner's method with the loop hash, skipping all the validity checking to just concentrate on the algorithm, and even with the checking, it is still 3 times faster. The conversion from base 10 to an alternate base is only marginally faster with the arrays rather than the strings.

I changed the Value::Error calls to plain old die, as Value::Error does that in the end anyway, and it seems odd to use a Value call for something that is not MathObject-specific. But that is a personal preference. In any case, you don't need return with Value::Error, as Value::Error throws an error condition and stops the execution of the function at that point.


I note that your examples all use contextArbitraryString.pl. That can lead to some issues (e.g., leading or trailing spaces in the student answer can be a problem, as I recall). It might be better to make an actual context that handles numbers in alternate bases. I once wrote a hexadecimal context, which was made available in this post, that might serve as a starting point. The hex() and main::spf($self->value, '%X') functions would need to be replaced by the conversion to and from base 10, but the rest of it should probably be OK as is. There would need to be a context flag indicating the base being used, so that you could change the base to the one you wanted.

This has the advantage of being able to do arithmetic in the given base, while the LimitedBase context could prevent operations, as is the case for LimitedHex in the contextHex.pl linked to the post above. I'm sure there are more features that could be added to the context. That was just a quick one I put together as an example.


Of course, these are just suggestions. You can take them or leave them as you see fit.

macros/math/nondecimal_base.pl Outdated Show resolved Hide resolved
macros/math/nondecimal_base.pl Outdated Show resolved Hide resolved
t/macros_math/nondecimal_base.t Outdated Show resolved Hide resolved
@dpvc
Copy link
Member

dpvc commented Jan 25, 2023

Also, one could use Math::Base::Convert for converting bases.

@pstaabp
Copy link
Member Author

pstaabp commented Jan 26, 2023

@dpvc Appreciate the feedback. I was thinking of getting a functional conversion first rather than an efficient one (thinking that efficiency wouldn't really matter for ww problems), but this is great.

Also, one could use Math::Base::Convert for converting bases.

I looked into this and decided that this would be simple enough to do "in house" instead of requiring another package.

I was thinking of creating a context. I'll look into your example from the forum.

@pstaabp
Copy link
Member Author

pstaabp commented Feb 11, 2023

@dpvc I'm starting to convert it to a context using your Hex example and just pushed current changes. What I was thinking was, for example, I could create a context and the set the base, then perform arithmetic (at least +,-,*).

However, the following

Context('NondecimalBase');
setBase(5);
$a = Compute('240+113');

correctly parses 240 in base 5, 113 in base, however also adds 240+113 in base 10 (353), then tries to convert 353 from base 5 and I get an error. Do I need to override +,-,* ?

@dpvc
Copy link
Member

dpvc commented Feb 16, 2023

Sorry for not getting back to you earlier on this. I looked at it, but didn't have time to write initially.

I haven't tried changing anything, but my firth thought is that you are trying to do conversion both to and from the base all in the context::NondecimalBase::Number class that replaces Parser::Number, and you have not replaced Value::Real for this context. These two need to work in concert to implement the new base.

The MathObjects library consists of two packages, Parser and Value. The Parser classes implement the parsing of a string into its abstract syntax tree, while the Value packages implement the different mathematical objects and the actions you can take with them (like adding them, and so on). The Value package was originally intended to be able to be used separately from Parser for computational purposes, but they have become rather intertwined over time, but the original separation into parsing and computation is still present.

Your context::NondecimalBase::Number is the parsing portion of the alternative-base setup, and it should recognize a number in the alternative base (as it currently does), but its eval() method should produce a (subclass of) Value::Real that is designed to do the computation for those numbers. That subclass will inherit all the functionality of an actual Value::Real, and you simply need to override its output functions in order to have it display its underlying real in the required base.

The current eval() method doesn't return a Value::Real, but rather a string, and that is not going to operate properly. So the first step to making this work is to convert context::Hex::Hex to context::NondecimalBase::Real with the output routines converting the base-10 real that is its value to the required base output. Then make your eval() function in context::NondecimalBase::Number return the Package('Real') as it did in the Hex context.

I will make more comments in the code itself, but that, I think, is the main issue.

Copy link
Member

@dpvc dpvc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made some extensive comments, here, that I hope will help you get this working.

macros/contexts/contextNondecimalBase.pl Outdated Show resolved Hide resolved
macros/contexts/contextNondecimalBase.pl Outdated Show resolved Hide resolved
macros/contexts/contextNondecimalBase.pl Outdated Show resolved Hide resolved
macros/contexts/contextNondecimalBase.pl Outdated Show resolved Hide resolved
macros/contexts/contextNondecimalBase.pl Outdated Show resolved Hide resolved
macros/contexts/contextNondecimalBase.pl Outdated Show resolved Hide resolved
sub setBase {
my $b = shift;
return Value::Error('The base must be greater than 1 and less than or equal to $max_base')
unless $b >= 2 && $b <= scalar(@$digits16);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no $max_base variable, and if you are using $digits16, you know the highest is 16.

But since your convert() allows for specifying a different set of digits, it may be that you want to allow the digits to be specified here. Then you can tell what the base is and can check that you have the right number of digits (rather than the other way around), and can create the digits array and reverse mapping once instead of doing it in convert() each time. The base, the digits, and the reverse mapping can be stored as properties in the context itself, rather than as globals. If you set them up here, you won't need $digits16 or $digit16any more (just use['0' ... '9', 'A' .. 'F']` as the default).

You are right to set the $context->{pattern}{number} based on the actual digits (but should only use the ones up to the base requested), though you will need to do $context->update after doing so.

If you subclass the Context object to add convert() as a method, you should also make setBase() a method as well. Otherwise, you will need to pass the context whose base you are setting as an argument to setBase().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you subclass the Context object to add convert() as a method, you should also make setBase() a method as well. Otherwise, you will need to pass the context whose base you are setting as an argument to setBase().

I think I'm also missing something here. This would be a subclass to Parser::Context ?

macros/contexts/contextNondecimalBase.pl Outdated Show resolved Hide resolved
macros/contexts/contextNondecimalBase.pl Outdated Show resolved Hide resolved
macros/contexts/contextNondecimalBase.pl Outdated Show resolved Hide resolved
@pstaabp
Copy link
Member Author

pstaabp commented Mar 7, 2023

@dpvc when you get a chance to look now. Lot's of cleanup, but adding isn't working right now. I'm testing 240+113 each in base 5, and it seems like Parser::BOP::add is being called with the two number interpreted in decimal. Do I need to override this to first convert from base b to 10 and then add?

@pstaabp
Copy link
Member Author

pstaabp commented Mar 22, 2023

@dpvc What I'm working on now is to be able to do $sum = Compute('123+401'); in the given base set as a flag.

I have something working, but seems like I'm going about this the right way. I have a class context::NondecimalBase::BOP::add and in the eval, I have

sub _eval {
	my ($self, $a, $b) = @_;
	my $base = $self->{equation}{context}{flags}{base};

	return context::NondecimalBase::convert(context::NondecimalBase::convert($a, from => $base) +
		context::NondecimalBase::convert($b, from => $base), to => $base);
}

but is there a way to have the parser automatically convert the operands for the operations +,-,* ? (I might drop / )

@dpvc
Copy link
Member

dpvc commented Mar 22, 2023

I have a class context::NondecimalBase::BOP::add

You do not need that, but I think you misunderstand the way it is supposed to work. Your context::NondecimalBase::Number should produce a context::NondecimalBase::Real as the result of its eval() method (yours currently produces a string; see below). The context::NondecimalBase::Real holds its value as a base-10 real, and all the computation is done by Perl on those base 10 values. That is all handled by the inherited methods from Value::Real. The only real purpose of context::NondecimalBase::Real is to convert the base-10 internal value to the proper base in its string() and TeX() methods. That's it. Your string() method is correct, but your TeX() one is not, as it takes the string() output and converts it again to the new base. You simply want

sub TeX {
	my $self = shift;
	my $base = $self->{context}{flags}{base};
	return '\text{' . $self->string . '}';
}

for that.

To fix your context::NondecimalBase::Number, you need to remove the line

	return context::NondecimalBase::convert($self->{value}, to => $base);

which (incorrectly) returns a string (the same string the student entered, which was converted to base 10 and then back, for no apparent reason). That lets it fall through to

	$self->Package('Real')->make($self->context, $self->{value});

which produces the context::NondecimalBase::Real with the proper base-10 value.

I also think you may not have the right idea about having a subclass of the Context class, but that is a separate issue for now. See if you can get the computation working first, and just set the base flag by hand for now.

@pstaabp
Copy link
Member Author

pstaabp commented Mar 22, 2023

@dpvc thanks for the help. I think I have a better handle on this now. If $n is a context::Nondecimal::Real, then $n->value is the base-10 representation of the number and $n->string is the base b representation.

And addition is working. That is Compute('240+113') using base-5 is 403.

@dpvc
Copy link
Member

dpvc commented Mar 22, 2023

If $n is a context::Nondecimal::Real, then $n->value is the base-10 representation of the number and $n->string is the base b representation.

Exactly.

And addition is working.

Great! Good job. The next step is to straighten out the context. Let me know when you are ready to do that.

@pstaabp
Copy link
Member Author

pstaabp commented Mar 23, 2023

Seems to be mostly working now. I'm trying to build a limited version of the context as well so students aren't allowed to put in operations.

@pstaabp
Copy link
Member Author

pstaabp commented Mar 25, 2023

I wouldn't call this "ready for prime-time". Maybe ready for late-night viewing instead, but quite functional and passing all its tests!!!

I added a 'LimitedNondecimalContext' as well which (consistent with other packages) allows only numerical values. Also, the code is cleanup and much better documentation.

Here are two sample problems using these Contexts.
nondecimal.pg.txt
nondecimal2.pg.txt

@pstaabp pstaabp force-pushed the convert-nondecimal-base branch 2 times, most recently from 44feb3a to aa739da Compare March 29, 2023 09:51
@somiaj
Copy link
Contributor

somiaj commented Mar 29, 2023

I notice two things testing this out, first the example given with different letters for digits doesn't seem to work:

Context('NondecimalBase');
Context()->flags->set(base => 12, digits => [0..9,'T','E']);

$test = Compute("3TE");

I get an error that 'TE' is not defined in this context;, and I think this is because the pattern for numbers has not been updated to deal with the new characters, $context->{pattern}{number} = '[0-9A-F]+';.

Second, if I pick two characters in the A-F range, I don't get any errors, but then I don't get the same characters back.

Context('NondecimalBase');
Context()->flags->set(base => 12, digits => [0..9,'B','D']);

$test = Compute("3BD");
BEGIN_PGML
[$test] = [_]{$test}
END_PGML

This gives me 3AB instead of 3BD as the output. I also see 3AB as the correct answer not 3BD.

@pstaabp
Copy link
Member Author

pstaabp commented Mar 30, 2023

I had tested this with '8E', but E is by definition defined, but hadn't tested '8T'. It appears that this is being parsed as 8*T, however '8E' is being parsed as the number '8E'.

@pstaabp
Copy link
Member Author

pstaabp commented Apr 1, 2023

Another update with the error that @somiaj noticed is fixed. More testing I determined that the digits need to be updated any time the base is changed. (For example if we go from base 12 to base 16).

I couldn't figure out how to run code when the Context()->flags->set(...) is called, so determined that the digits used need to be updated each time to ensure the correct digits are used. If anyone has some ideas, so those two lines (noted the code) would only need to be run when the base changed.

@pstaabp pstaabp force-pushed the convert-nondecimal-base branch 2 times, most recently from 541a93c to 44fa83d Compare April 7, 2023 13:01
@pstaabp
Copy link
Member Author

pstaabp commented Apr 18, 2023

@dpvc Can you take a look at this again when you get a chance?

@Alex-Jordan
Copy link
Contributor

So I think I was able to do what I described. That is, make this happen:

Context('NondecimalBase')->setBase(5);
TEXT(Real(9));     # prints 14, which is nine in base five.
TEXT(Real('9'));   # raises an error because '9' is not a base 5 digit.

The question is if everyone thinks this is a good distinction to have. The TLDR of my post before was to consistently make strings be interpreted with the alternate base, and plain perl integers be interpreted base ten.

@Alex-Jordan
Copy link
Contributor

Let's forget what I was trying to do in the last few posts. While I can get Real(9) and Real('9') to be distinguished, it will not carry over easily to things like List(8, 9) and List('8', '9'), etc. The distinction is too fragile.

I opened a PR (pstaabp#22) to your branch with four commits:

  • some light editing to the documentation. This includes clarifying when MathObject argument strings will have numerical substrings interpreted in base ten and when in the context's base.
  • I added some strings you can use for setBase. The most extreme is ...->setBase('base64') as a short way to declare the 64 digits for that base.
  • Since this is all integer stuff, I added a modulo operator %.
  • I changed the file name (and all the package naming etc) to BaseN instead of NonDecimalBase. If you want to keep that name, can just cut the last commit. Although the last commit also has perltidy, so if it is cut, it needs another perltidy application.

@pstaabp
Copy link
Member Author

pstaabp commented Feb 13, 2024

Just merged in the suggestions from @Alex-Jordan and added a few more unit tests.

greater than or equal to 2. The numbers will be stored internally in decimal, though parsed
and shown in the chosen base.

In addition, basic integer arithemetic (+,-,*,/,^) are available for these numbers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have mentioned % here.

Copy link
Contributor

@Alex-Jordan Alex-Jordan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Officially marking for approval.

Convert [$a] to base-5:

[$a] = [__]{$a_5}[`_5`]
END_PGML
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, this is almost a complete PG file but missing the last line.

@Alex-Jordan
Copy link
Contributor

I marked for approval. But then remembered one last thing I had thought to add. I opened a PR to your branch about that.

Beyond that, was there any consideration for case insensitivity? Like having hexadecimal, but letting 10AE equal 10ae? I know it could be more trouble than it's worth when you consider a digit set that contains both capital and lowercase letters. But assuming the only letters in the digit set were capital, you could imagine allowing case insensitivity. Maybe that is just enabling inattentive students...

@pstaabp
Copy link
Member Author

pstaabp commented Feb 14, 2024

I marked for approval. But then remembered one last thing I had thought to add. I opened a PR to your branch about that.

Beyond that, was there any consideration for case insensitivity? Like having hexadecimal, but letting 10AE equal 10ae? I know it could be more trouble than it's worth when you consider a digit set that contains both capital and lowercase letters. But assuming the only letters in the digit set were capital, you could imagine allowing case insensitivity. Maybe that is just enabling inattentive students...

Maybe on a case-by-case basis. Clearly on your new base64 example, case sensitivity is important.

@drgrice1 drgrice1 merged commit 09abcf3 into openwebwork:develop Feb 14, 2024
2 checks passed
@pstaabp pstaabp deleted the convert-nondecimal-base branch February 14, 2024 21:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants