perl - Perl OOP attribute manipulation best practice

Question

Assume the following code:

package Thing;
sub new {
    my $this=shift;
    bless {@_},$this;
}
sub name {
    my $this=shift;
    if (@_) {
        $this->{_name}=shift;
    }
    return $this->{_name};
}

Now assume we've instantiated an object thusly:

my $o=Thing->new();
$o->name('Harold');

Good enough. We could also instantiate the same thing more quickly with either of the following:

my $o=Thing->new(_name=>'Harold');  # poor form
my $o=Thing->new()->name('Harold');  # caution: name() returns the value, so $o is 'Harold' here, not the object

To be sure, I allowed attributes to be passed in the constructor to allow "friendly" classes to create objects more completely. It could also allow for a clone-type operator with the following code:

my $o=Thing->new(%$otherthing);  # will clone attrs if not deeper than 1 level

This is all well and good. I understand the need for hiding attributes behind methods to allow for validation, etc.

$o->name;  # returns 'Harold'
$o->name('Fred'); # sets name to 'Fred' and returns 'Fred'

But what this doesn't allow is easy manipulation of the attribute based on itself, such as:

$o->{_name}=~s/old/ry/;  # name is now 'Harry', but this "exposes" the attribute

One alternative is to do the following:

# Cumbersome, not syntactically sweet
my $n=$o->name;
$n=~s/old/ry/;
$o->name($n);

Another potential is the following method:

sub Name :lvalue {  # note the capital 'N', not the same as name
    my $this=shift;
    return $this->{_name};
}

Now I can do the following:

$o->Name=~s/old/ry/;

So my question is this... is the above "kosher"? Or is it bad form to expose the attribute that way? I mean, doing that takes away any validation that might be found in the 'name' method. For example, if the 'name' method enforced a capital first letter and lowercase letters thereafter, the 'Name' (capital 'N') bypasses that and forces the user of the class to police herself in the use of it.

So, if the 'Name' lvalue method isn't exactly "kosher" are there any established ways to do such things?

I have considered (but get dizzy considering) things like tied scalars as attributes. To be sure, it may be the way to go.

Also, are there perhaps overloads that may help?

Or should I create replacement methods in the vein of (if it would even work):

sub replace_name {
    my $this=shift;
    my $repl=shift;
    my $new=shift;
    $this->{_name}=~s/$repl/$new/;
}
...
$o->replace_name(qr/old/,'ry');

Thanks in advance... and note, I am not very experienced in Perl's brand of OOP, even though I am fairly well-versed in OOP itself.

Additional info: I guess I could get really creative with my interface... here's an idea I tinkered with, but I guess it shows that there really are no bounds:

sub name {
    my $this=shift;
    if (@_) {
        my $first=shift;
        if (ref($first) eq 'Regexp') {
            my $second=shift;
            $this->{_name}=~s/$first/$second/;
        }
        else {
            $this->{_name}=$first;
        }
    }
    return $this->{_name};
}

Now, I can either set the name attribute with

$o->name('Fred');

or I can manipulate it with

$o->name(qr/old/,'ry');  # name is now Harry

This still doesn't allow stuff like $o->name.=' Jr.'; but that's not too tough to add. Heck, I could allow callback functions to be passed in, couldn't I?

score 4 · Accepted Answer

Your first code example is absolutely fine; this is a standard way to write accessors. Of course, it can get ugly when doing a substitution. The best solution might be:

$o->name($o->name =~ s/old/ry/r);

The /r flag (available since Perl 5.14) returns the result of the substitution instead of modifying its operand. On older perls, the equivalent is:

$o->name(do { (my $t = $o->name) =~ s/old/ry/; $t });

Well yes, this 2nd solution is admittedly ugly. But I am assuming that accessing the fields is a more common operation than setting them.

Depending on your personal style preferences, you could have two different methods for getting and setting, e.g. name and set_name. (I do not think get_ prefixes are a good idea: four unnecessary characters.)

If substituting parts of the name is a central aspect of your class, then encapsulating this in a special substitute_name method sounds like a good idea. Otherwise this is just unnecessary ballast, and a bad tradeoff for avoiding occasional syntactic pain.
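If you do go that route, a minimal sketch might look like the following. The body of substitute_name is my assumption about how such a method could be written; it goes through the name accessor, so any validation added there later still applies.

```perl
package Thing;
use strict;
use warnings;

sub new { my $class = shift; return bless {@_}, $class }

# Combined getter/setter, as in the question.
sub name {
    my $self = shift;
    $self->{_name} = shift if @_;
    return $self->{_name};
}

# Hypothetical helper: apply a substitution to a copy of the name
# and store the result through the public accessor.
sub substitute_name {
    my ($self, $pattern, $replacement) = @_;
    (my $new = $self->name) =~ s/$pattern/$replacement/;
    return $self->name($new);
}

package main;

my $o = Thing->new(_name => 'Harold');
$o->substitute_name(qr/old/, 'ry');
print $o->name, "\n";    # Harry
```

The pattern can be a qr// object or a plain string; either way the caller never touches $o->{_name}.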

I do not advise you to use lvalue methods, as these are experimental.

I would rather not see (and debug) some “clever” code that returns tied scalars. This would work, but feels a bit too fragile for me to be comfortable with such solutions.

Operator overloading does not help with writing accessors. Especially assignment cannot be overloaded in Perl.

Writing accessors is boring, especially when they do no validation. There are modules that can handle autogeneration for us, e.g. Class::Accessor. This adds generic accessors get and set to your class, plus specific accessors as requested. E.g.

package Thing;
use Class::Accessor 'antlers';  # use the Moose-ish syntax
has name => (is => 'rw');  # declare a read-write attribute

# new is autogenerated. Caution: it takes a hashref

Then:

Thing->new({ name => 'Harold'});
# or
Thing->new->name('Harold');
# or any of the other permutations.

If you want a modern object system for Perl, there is a range of compatible implementations. The most feature-rich of these is Moose, which allows you to add validation, type constraints, default values, etc. to your attributes. E.g.

package Thing;
use Moose; # this is now a Moose class

has first_name => (
  is => 'rw',
  isa => 'Str',
  required => 1, # must be given in constructor
  trigger => \&_update_name, # run this sub after attribute is set
);
has last_name => (
  is => 'rw',
  isa => 'Str',
  required => 1, # must be given in constructor
  trigger => \&_update_name,
);
has name => (
  is => 'ro',  # readonly
  writer => '_set_name', # but private setter
);

sub _update_name {
  my $self = shift;
  $self->_set_name(join ' ', $self->first_name, $self->last_name);
}

# accessors are normal Moose methods, which we can modify
before first_name => sub {
  my $self = shift;
  if (@_ and $_[0] !~ /^\pU/) {
    Carp::croak "First name must begin with uppercase letter";
  }
};

score 3 · Accepted Answer

The purpose of class interface is to prevent users from directly manipulating your data. What you want to do is cool, but not a good idea.

In fact, I design my classes so that even the class itself doesn't know its own structure:

package Thingy;
use Carp;  # provides croak

sub new {
    my $class = shift;
    my $name  = shift;

    my $self = {};
    bless $self, $class;
    $self->name($name);
    return $self;
}

sub name {
    my $self = shift;
    my $name = shift;

    my $attribute = "GLUNKENSPEC";
    if ( defined $name ) {
        $self->{$attribute} = $name;
    }
    return $self->{$attribute};
}

You can see by my new constructor that I could pass it a name for my Thingy. However, my constructor doesn't know how I store my name. Instead, it merely uses my name method to set the name. As you can see by my name method, it stores the name in an unusual way, but my constructor doesn't need to know or care.

If you want to manipulate the name, you have to work at it (as you showed):

my $name = $thingy->name;
$name =~ s/old/ry/;
$thingy->name( $name );

In fact, a lot of Perl developers use inside-out classes just to prevent this direct object manipulation.
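For readers who haven't met the idea, here is a rough sketch of an inside-out class. The storage hash %name_of and the details below are my own illustration, not a canonical implementation: the object is just an empty blessed scalar, and the data lives in a lexical hash keyed by the object's address.

```perl
package Thingy;
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Attribute storage lives in a lexical hash keyed by object address.
# The object itself holds no data, so there is no $obj->{...} to poke at.
my %name_of;

sub new {
    my ($class, $name) = @_;
    my $self = bless \do { my $anon }, $class;   # blessed empty scalar ref
    $self->name($name) if defined $name;
    return $self;
}

sub name {
    my $self = shift;
    $name_of{ refaddr($self) } = shift if @_;
    return $name_of{ refaddr($self) };
}

# Remove the entry when the object is destroyed, or the hash grows forever.
sub DESTROY {
    my $self = shift;
    delete $name_of{ refaddr($self) };
}

package main;

my $thingy = Thingy->new('Harold');
print $thingy->name, "\n";    # Harold
```

An attempt like $thingy->{name} now dies with "Not a HASH reference", so the accessor is the only way in.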

What if you want to be able to directly manipulate a class by passing in a regular expression? You have to write a method to do this:

sub mod_name {
    my $self        = shift;
    my $pattern     = shift;
    my $replacement = shift;

    if ( not defined $replacement ) {
        croak qq(Some basic error checking: Need pattern and replacement string);
    }

    my $name = $self->name;     # Using my name method for my class
    if ( not defined $name ) {
       croak qq(Cannot modify name: Name is not yet set.);
    }

    $name =~ s/$pattern/$replacement/;
    return $self->name($name);
}

Now, the developer can do this:

my $thingy = Thingy->new( "Harold" );
$thingy->mod_name( "old", "ry" );
say $thingy->name;   # Says "Harry" (say requires: use feature 'say')

Whatever time or effort you save by allowing direct object manipulation is outweighed by the extra effort it will take to maintain your program. Most methods don't take more than a few minutes to create. If I suddenly got a hankering to manipulate my object in a new and surprising way, it's easy enough to create a new method to do this.
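As an illustration of how little work such a method is, here is a sketch that covers the .= ' Jr.' case from the question. The method name append_to_name is made up for this example; like mod_name, it only talks to the name accessor and never looks at the underlying hash key.

```perl
package Thingy;
use strict;
use warnings;
use Carp;

sub new {
    my ($class, $name) = @_;
    my $self = bless {}, $class;
    $self->name($name);
    return $self;
}

sub name {
    my $self = shift;
    my $name = shift;
    $self->{GLUNKENSPEC} = $name if defined $name;
    return $self->{GLUNKENSPEC};
}

# Hypothetical extra method: append a suffix using only the public accessor.
sub append_to_name {
    my ($self, $suffix) = @_;
    croak "Need a suffix to append" if not defined $suffix;

    my $name = $self->name;
    croak "Cannot modify name: Name is not yet set." if not defined $name;

    return $self->name($name . $suffix);
}

package main;

my $thingy = Thingy->new('Harold');
$thingy->append_to_name(' Jr.');
print $thingy->name, "\n";    # Harold Jr.
```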

^1. No. I don't actually use random nonsense words to protect my class. This is purely for demo purposes to show that even my constructor doesn't have to know how methods actually store their data.

score 2 · Accepted Answer

I understand the need for hiding attributes behind methods to allow for validation, etc.

Validation is not the only reason, although it is the only one you refer to. I mention this because another is that encapsulation like this leaves the implementation open. For example, if you have a class which needs to have a string "name" which can be get and set, you could just expose a member, name. However, if you instead use get()/set() subroutines, how "name" is stored and represented internally doesn't matter.

That can be very significant if you write a lot of code that uses the class and then suddenly realize that, although the user may be accessing "name" as a string, it would be much better stored some other way (for whatever reason). If the user was accessing the string directly, as a member field, you now have to compensate by including code that updates name whenever the real underlying value changes. But wait: how can you then compensate for the client code that changed name?

You can't. You're stuck. You now have to go back and change all the code that uses the class -- if you can. I'm sure anyone who has done enough OOP has run into this situation in one form or another.
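To make that concrete, here is a small sketch of the happy path, where callers only ever used the accessor. The Person class, the _parts field, and the last_name method are all invented for this example. The storage has been changed from a single string to a list of name parts, yet code written against name() keeps working unchanged.

```perl
package Person;
use strict;
use warnings;

sub new {
    my ($class, %args) = @_;
    my $self = bless {}, $class;
    $self->name($args{name}) if defined $args{name};
    return $self;
}

# The public API is still "name as one string", but internally the
# value is now stored as a list of parts (an assumed redesign).
sub name {
    my $self = shift;
    if (@_) {
        $self->{_parts} = [ split ' ', shift ];
    }
    return join ' ', @{ $self->{_parts} || [] };
}

# A new capability that the new representation makes cheap.
sub last_name {
    my $self = shift;
    return $self->{_parts} ? $self->{_parts}[-1] : undef;
}

package main;

my $p = Person->new(name => 'Harold Smith');

# Client code written against the old string-based API still works.
my $n = $p->name;
$n =~ s/old/ry/;
$p->name($n);

print $p->name, "\n";         # Harry Smith
print $p->last_name, "\n";    # Smith
```

Any client that had reached into $p->{_name} directly would have broken silently at this point.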

No doubt you've read all this before, but I'm bringing it up again because there are a few points (perhaps I've misunderstood you) where you seem to outline strategies for changing "name" based on your knowledge of the implementation, and not what was intended to be the API. That is very tempting in Perl because of course there is no access control -- everything is essentially public -- but it is still a very, very bad practice for the reason just described.

That doesn't mean, of course, that you can't simply commit to exposing "name" as a string. That's a decision and it won't be the same in all cases. However, in this particular case, if what you are particularly concerned with is a simple way to transform "name", IMO you might as well stick with a get/set method. This:

# Cumbersome, not syntactically sweet

Maybe true (although someone else might say it is simple and straightforward), but your primary concern should not be syntactic sweetness, and neither should speed of execution. They can be concerns, but your primary concern has to be design, because no matter how sweet and fast your stuff is, if it is badly designed, it will all come down around you in time.

Remember, "Premature optimization is the root of all evil" (Knuth).

score 2 · Accepted Answer

So my question is this... is the above "kosher"? Or is it bad form to expose the attribute that way?

It boils down to: Will this continue to work if the internals change? If the answer is yes, you can do many other things, including but not limited to validation.

The answer is yes. This can be done by having the method return a magical value.

{
   package Lvalue;
   sub TIESCALAR { my $class = shift; bless({ @_ }, $class) }
   sub FETCH { my $self = shift; my $m = $self->{getter}; $self->{obj}->$m(@_) }
   sub STORE { my $self = shift; my $m = $self->{setter}; $self->{obj}->$m(@_) }
}

sub new { my $class = shift; bless({}, $class) }

sub get_name {
   my ($self) = @_;
   return $self->{_name};
}

sub set_name {
   my ($self, $val) = @_;
   die "Invalid name" if !length($val);
   $self->{_name} = $val;
}

sub name :lvalue {
   my ($self) = @_;
   tie my $rv, 'Lvalue', obj=>$self, getter=>'get_name', setter=>'set_name';
   return $rv;
}

my $o = __PACKAGE__->new();
$o->name = 'abc';
print $o->name, "\n";    # abc
$o->name = '';           # Invalid name
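The same approach should also cover the in-place substitution from the question. The sketch below repeats the code above inside a package I have called Thing (my choice of name, to keep the example self-contained) and then modifies the attribute through the tied lvalue. Each modification does a FETCH through get_name followed by a STORE through set_name, so validation is not bypassed.

```perl
use strict;
use warnings;

{
    package Lvalue;
    sub TIESCALAR { my $class = shift; bless({ @_ }, $class) }
    sub FETCH { my $self = shift; my $m = $self->{getter}; $self->{obj}->$m() }
    sub STORE { my $self = shift; my $m = $self->{setter}; $self->{obj}->$m(@_) }
}

{
    package Thing;
    sub new { my $class = shift; bless({}, $class) }

    sub get_name { my ($self) = @_; return $self->{_name} }

    sub set_name {
        my ($self, $val) = @_;
        die "Invalid name\n" if !length($val);
        $self->{_name} = $val;
    }

    # Returns a tied scalar, so reads and writes are routed to the
    # getter and setter above.
    sub name :lvalue {
        my ($self) = @_;
        tie my $rv, 'Lvalue', obj => $self, getter => 'get_name', setter => 'set_name';
        return $rv;
    }
}

my $o = Thing->new;
$o->name = 'Harold';
$o->name =~ s/old/ry/;    # fetch, substitute, then store via set_name
$o->name .= ' Jr.';       # appending goes through set_name as well
print $o->name, "\n";
```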
