The Effect of Portfolio-Based Assessment on Jordanian EFL Learners’ Writing Performance

This study examines the effect of portfolio assessment on Jordanian EFL tenth grade learners’ overall writing performance and their performance on the sub-skills of focus, development, organization, conventions and word choice. The study used a quasi-experimental design in which an experimental group and a control group of 20 students each were purposefully drawn from tenth grade classes at the public schools for girls in the North-Eastern Badia Directorate of Education. The experimental group was instructed on how to generate ideas, structure, draft, and edit their written pieces following Hamp-Lyons and Condon’s (2000) model, while the control group was instructed conventionally as prescribed in the Teacher’s Book. The findings revealed that the portfolio group outperformed the conventionally instructed group (at α ≤ 0.05) in their overall writing performance and in their performance on the writing sub-skills of focus, development, organization, conventions and word choice.


Introduction
The portfolio has emerged as a viable assessment tool since the 1990s, as educational practitioners have sought alternative, more authentic means of assessment to align with the conceptions of teaching and learning that place more emphasis on the learners' evolution. The portfolio, a collection of a learner's best work, not only documents learner progress over time but also encourages him/her to become more self-directed, take the initiative for learning, make judgments, participate in the evaluation of his/her own work, and solve emerging problems (Crosby, 1997; Gosselin, 1998; Yang, 2003).
That portfolios have been recognized as an important educational assessment tool is perhaps not surprising, as they have been credited with contributing significantly to the assessment and learning process in several areas. For instance, the range and comprehensiveness of the evidence they provide and the variety and flexibility of the purposes they serve (Julius, 2000) have been noted. They have been reported to help document growth over time (e.g., Politano, Cameron, Tate & MacNaughton, 1997; Tierney, Carter & Desai, 1991), both in process- and product-related learning (e.g., Costa & Kallick, 2000; Gillespie, Ford, Gillespie & Leavell, 1996), to provide data for out-of-class assessment (e.g., Fritz, 2001; Willis, 2000), and to inform instructional decision-making (e.g., Arter & Spandel, 1992; Gillespie et al., 1996). Also, their potential to allow students to reflect on what they have accomplished (Lam, 2011) and to increase students' motivation and opportunities for autonomous learning (Crosby, 1997) has contributed significantly to the popularity of portfolios in classroom assessment.
The positive aspects of the portfolio have been extended to specific content areas of learning as well. A plethora of research acknowledges portfolios as a promising alternative to traditional instruction and assessment, both in the first and second/foreign language classroom.

Problem, Purpose, Questions and Hypotheses of the Study
Following the shift from traditional teacher-centered assessment to 'alternative' student-centered assessment in the language classroom, portfolios have received the lion's share of attention as a tool which addresses not only assessment but also teaching and learning. However, despite the widely reported prospective gains (e.g., Apple & Shimo, 2004; Marefat, 2004), research on portfolios in the Jordanian language teaching context has lagged behind, which is further reflected in virtually non-existent portfolio-based pedagogical practices in the Jordanian classroom.
Traditional writing strategies and (summative) timed tests are still the norm in the Jordanian EFL classroom, which may be partially accountable for reports of poor writing performance among students throughout primary and secondary education. The reportedly far from satisfactory realities of foreign language instruction in general and writing instruction in particular, which are consistent with international accounts (e.g., Harder, 2006; Moon, 2008) of writing as the neglected skill, have prompted these researchers to seek an alternative approach to writing instruction and assessment in the Jordanian EFL classroom. Thus, the study examines the potential effect of portfolio assessment on Jordanian EFL tenth grade students' writing performance, both overall and on the writing sub-skills of focus, development, organization, conventions and word choice.
It is worth noting that this study treats assessment as a central contributor to the instructional process rather than an end in itself. These researchers use assessment formatively to monitor learning and provide ongoing feedback to help students identify their strengths and weaknesses and target areas that need work, as opposed to summative assessment, which evaluates student learning at the end of an instructional unit against a set of standards or benchmarks.
More specifically, the study attempts to answer the following questions:

Significance of the Study
As portfolio assessment is hardly ever used in the Jordanian EFL context, except probably for a few isolated research initiatives by in-service teachers for graduate work (e.g., Alnethami, 2009), this research may not only add to the existing literature but also set an example for further similar research in Jordan and other similar EFL contexts. Furthermore, as experienced EFL practitioners, these researchers realize that writing, often dubbed the neglected skill, is almost always given the lowest priority relative to the other three skills (e.g., Al-Gomoul, 2011; Al-Jarf, 2007; Hyland, 2003; Soles, 2005) and, thus, continues to need special attention in the EFL classroom.
This study is further meant to provide information for teachers, curriculum designers and other stakeholders concerned with reforming foreign language instruction, in Jordan in particular and in other similar contexts, as the role of portfolio assessment in effectively improving EFL students' writing performance is demonstrated herein.

Methods and Procedures
The researcher informed the participants about the nature and purpose of the study and answered all their queries prior to obtaining their consent to participate in the research.
The study followed the quasi-experimental control/experimental group design. Three variables were examined: the independent variable of portfolio assessment and the two dependent variables of overall writing performance and writing performance in the sub-skills of focus, development, organization, conventions and word choice.
To achieve the purpose of the study, the researchers made use of several instruments: pre/post tests, the Portfolio Assessment Model and the Analytic Scoring Rubric.

1. The pre-test, in which the participants of both groups were asked to write a 100-word essay about trees, was administered to the experimental and control groups prior to the treatment to determine potential significant differences in their overall writing performance and in their performance on the five sub-skills of focus, development, organization, conventions and word choice. The choice of the topics for both the pre- and post-test essays was driven by the content of the student textbook, to avoid overwhelming the students with unduly difficult or uninteresting topics.

2. The post-test, in which the participants of the control group only were asked to write a 100-word essay about rainforests, was administered at the end of the experiment.

3. The Portfolio Assessment Model, put forth by Hamp-Lyons and Condon (2000), was adopted to collect data from the portfolio assessment group. The Model consists of three procedures: collection (in which the learner is expected to collect his/her final drafts in a portfolio), selection (in which the learner is expected to select the best three final drafts for summative grading), and reflection (in which the learner is expected to reflect upon the first and the final drafts).

4. The Analytic Scoring Rubric, adapted from Wang and Laio (2008), consisted of the five sub-skills of focus, development, organization, conventions and word choice, each with six levels. Each of the five sub-skills is rated on a scale from zero to five along a set of specific descriptors. For example, the scale used in assessing the sub-skill of focus includes descriptors such as "fully addressing the writing task". Each participant's score was the mean of the two raters' scores (out of 25).
The validity of the instruments was established by referring them to a jury of Jordanian university professors in education, measurement and evaluation, and curriculum and instruction.
The jury's comments and recommendations (e.g., rearranging, merging and deleting items, adjusting the weights for the writing sub-skills in the rubric) were all taken into account and reflected in the final versions of these instruments. The reliability of the instruments was also established. The pre- and post-tests were administered to two comparable groups of tenth grade students from the North-Eastern Badia Directorate of Education, which were excluded from the main sample of the study, allowing a three-week interval between the two administrations. The reliability coefficient for the pre-test amounted to 0.96 and that for the post-test to 0.89, both considered appropriate for the purposes of this research.
Furthermore, intra- and inter-rater reliability of scoring was also established by asking another rater to use the Rubric to assess a sample of 15 students' responses on the pre-test. Both raters individually evaluated the same sample of pre-test responses using the Rubric. The intra-rater reliability coefficients for the two raters and their inter-rater reliability coefficient amounted to 0.89, 0.86 and 0.92, respectively, which are all appropriate for the purpose of this research.
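As an illustration only, a reliability coefficient between two sets of scores can be computed as a Pearson correlation. The study does not state which coefficient was used, so the choice of Pearson's r here is an assumption, and the rater totals below are hypothetical values, not data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical totals (out of 25) given by two raters to the same five essays.
rater_one = [12, 15, 9, 20, 17]
rater_two = [13, 14, 10, 19, 18]
inter_rater = pearson_r(rater_one, rater_two)
print(f"inter-rater r = {inter_rater:.2f}")
```

The same function could be applied to one rater's first and second scoring of the same essays to obtain an intra-rater coefficient, again under the assumption that a correlation coefficient is the intended measure.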
Two tenth-grade sections from a purposefully chosen school for girls in the North-Eastern Badia Directorate of Education constituted the sample of the study. The participants of the experimental group and the control group were all pre-tested by writing an essay of about 100 words about trees. A number of lesson plans based on Hamp-Lyons and Condon's (2000) portfolio model were designed and used to teach the experimental group as follows: At the beginning of the treatment, the instructor/first researcher illustrated the design, objective and procedure of the Portfolio Model and allowed the students to practice writing on topics from their textbook, Action Pack 10. He marked the students' first drafts and provided feedback on each per the five sub-skills in the Rubric (viz., focus, development, organization, conventions and word choice). After being allowed time to ponder the feedback, the participants were asked to reflect on their own writing. After their self-assessment, they were asked to exchange papers and assess each other's written pieces, after which further reflection was expected in light of the instructor and peer feedback. The instructor/first researcher was available for clarification and further feedback, either individually or in groups, throughout the sessions and in the after-session recess.
By contrast, the control group was instructed conventionally per the instructions of the Teacher Book: Every session, the instructor/first researcher introduced the topic of the day and then wrote it on the board. He then reminded the students to write a topic sentence, support the main idea with some detail, and then restate their main idea in the conclusion. He further directed them on how to generate ideas, organize them and draft their essays, all within the session. The students sat quietly, thinking and writing down sentences. When done, the essays were read aloud to the whole class. Further revisions were assigned as homework before submission the following session. No pair or group work was allowed.
At the end of the treatment, the students in the experimental group were each asked to choose three of their best essays for final assessment. A student's score was the average of the scores of these three essays, based on the five criteria of the Rubric (viz., focus, development, organization, conventions and word choice), which were each divided into five sub-levels. Every student received a composite score out of 25 (itself the average of the two raters' scores).
The control group's writing performance was assessed based on the post-test, in which they were asked to write an essay of about 100 words about rainforests.
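The scoring arithmetic described here can be sketched as follows. This is a minimal illustration, assuming each rater gives every sub-skill a rating from 0 to 5, that an essay's composite is the sum over the five sub-skills of the two raters' mean rating, and that the portfolio score is the mean of the three selected essays. The function names and the ratings are hypothetical.

```python
from statistics import mean

SUB_SKILLS = ("focus", "development", "organization", "conventions", "word choice")


def essay_score(rater_a, rater_b):
    """Composite essay score out of 25 from two raters' sub-skill ratings."""
    for ratings in (rater_a, rater_b):
        for skill in SUB_SKILLS:
            if not 0 <= ratings[skill] <= 5:
                raise ValueError(f"rating for {skill!r} must be between 0 and 5")
    # Average the two raters on each sub-skill, then add up the five sub-skills.
    return sum(mean((rater_a[skill], rater_b[skill])) for skill in SUB_SKILLS)


def portfolio_score(essays):
    """Mean composite score over the selected essays (three in the study)."""
    return mean(essay_score(a, b) for a, b in essays)


# Hypothetical ratings: every sub-skill gets the same rating, for brevity.
def flat(rating):
    return dict.fromkeys(SUB_SKILLS, rating)


selected = [(flat(4), flat(3)), (flat(4), flat(4)), (flat(5), flat(4))]
student_score = portfolio_score(selected)
print(f"portfolio score: {student_score} / 25")
```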
For data analysis, means and standard deviations were used to compare the writing performance of the experimental and control groups. ANCOVA was also used to control for the differences between the groups before the treatment and to detect any significant differences (at α ≤ 0.05) between the experimental group and the control group which can be attributed to the treatment.
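For readers unfamiliar with the technique, the sketch below shows how a one-way ANCOVA with a single covariate (here, the pre-test score) yields adjusted means and an F statistic. It is an illustration of the general method, not a reproduction of the study's analysis. The function name and all scores are hypothetical, and the p-value step is left out because it needs an F-distribution routine (for example, `scipy.stats.f.sf`).

```python
from statistics import mean


def ancova(groups):
    """One-way ANCOVA with one covariate.

    groups maps a label to a list of (covariate, outcome) pairs,
    e.g. (pre-test score, final score) for each student.
    """
    all_x = [x for pairs in groups.values() for x, _ in pairs]
    all_y = [y for pairs in groups.values() for _, y in pairs]
    n, k = len(all_x), len(groups)
    grand_x, grand_y = mean(all_x), mean(all_y)

    # Reduced model: one intercept and one slope for everyone.
    sxx_t = sum((x - grand_x) ** 2 for x in all_x)
    syy_t = sum((y - grand_y) ** 2 for y in all_y)
    sxy_t = sum((x - grand_x) * (y - grand_y) for x, y in zip(all_x, all_y))

    # Full model: one intercept per group, common (pooled within-group) slope.
    sxx_w = syy_w = sxy_w = 0.0
    group_means = {}
    for label, pairs in groups.items():
        mx = mean(x for x, _ in pairs)
        my = mean(y for _, y in pairs)
        group_means[label] = (mx, my)
        sxx_w += sum((x - mx) ** 2 for x, _ in pairs)
        syy_w += sum((y - my) ** 2 for _, y in pairs)
        sxy_w += sum((x - mx) * (y - my) for x, y in pairs)

    slope = sxy_w / sxx_w
    ss_full = syy_w - sxy_w ** 2 / sxx_w
    ss_reduced = syy_t - sxy_t ** 2 / sxx_t
    df_effect, df_error = k - 1, n - k - 1
    f_stat = ((ss_reduced - ss_full) / df_effect) / (ss_full / df_error)
    # Adjusted mean: group outcome mean corrected to the grand covariate mean.
    adjusted = {g: my - slope * (mx - grand_x) for g, (mx, my) in group_means.items()}
    return {"slope": slope, "adjusted_means": adjusted,
            "f_stat": f_stat, "df": (df_effect, df_error)}


# Hypothetical (pre-test, final) scores out of 25; not data from the study.
scores = {
    "control": [(5, 6), (7, 7), (9, 10), (11, 11)],
    "portfolio": [(5, 10), (7, 12), (9, 13), (11, 16)],
}
result = ancova(scores)
print(result["adjusted_means"], result["f_stat"], result["df"])
```

The F statistic would then be compared against the F distribution with the reported degrees of freedom to decide significance at the chosen α.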

Findings of the Study
Drawing on information from the relevant sources of data obtained in the course of the study, each research question is addressed by testing the relevant hypothesis. To test the first hypothesis, that portfolio assessment has no statistically significant effect (at α ≤ 0.05) on Jordanian EFL tenth grade students' overall writing performance, descriptive statistics were obtained, as shown in Table 1.
Table 3 shows differences in the means, standard deviations and adjusted mean scores on the post-test and the portfolio assessment between the experimental group and the control group in their performance on the sub-skills of writing, in favor of the experimental group. Table 4 shows statistically significant differences in students' performance on the writing sub-skills of conventions, word choice, organization, development and focus, respectively. Thus, the null hypothesis that portfolio assessment has no statistically significant effect (at α ≤ 0.05) on Jordanian tenth grade EFL learners' writing performance on the sub-skills of focus, development, organization, conventions and word choice is rejected.

Limitations of the Study
The potential generalizability of the findings may be limited by a number of factors which could not have been avoided. First, the experiment only targeted intact sections of female tenth grade students in the public schools of North-Eastern Badia Directorate of Education over a period of three months in the first semester of the academic year 2014/2015. Not only would a larger sample and longer duration have provided better data, but having both male and female students would have enhanced the generalizability of the findings. Second, had a teacher, other than the first researcher, taught both the experimental and control groups, it would have added to the credibility of the findings and ruled out any potential shades of bias. However, that the experiment was conducted in North-Eastern Badia, inaccessibly remote for anyone from another area, accounted for not finding any volunteers to teach the groups, and thus the first researcher ended up teaching both groups. Third, the researchers had initially intended to videotape the experiment, but the conservative nature of the community prompted the participants and their teachers and school principals to ask that sessions not be videotaped. Even though nothing has escaped documentation, the researchers would have felt more confident with the hard evidence provided by the recordings.
The participants of this study were 40 female Jordanian tenth grade EFL students purposefully chosen from the public schools in the North-Eastern Badia Directorate of Education. The experimental group consisted of 20 students and was taught through the Portfolio Assessment Model. The control group consisted of 20 students and was taught per the guidelines of the Teacher Book. To collect the data, the participants' and the school principal's consent to participate in the study was obtained. Permission to use the data was obtained through the school's participation. The participants were informed by the researcher about the nature and purpose of the study.
The research questions of the study are as follows:
1. To what extent does portfolio assessment affect Jordanian EFL students' overall writing performance?
2. To what extent does portfolio assessment affect Jordanian EFL students' writing performance on the sub-skills of focus, development, organization, conventions and word choice?
To achieve the purpose of the study, these questions are rephrased into two null hypotheses.

Table 1 shows differences in the means and standard deviations of the experimental and control groups: the mean was 3.75 (standard deviation 1.37) for the experimental group and 6.55 (standard deviation 2.21) for the control group. There were also differences in the adjusted mean scores of the experimental group and the control group on the post-test and the portfolio assessment in favor of the experimental group.

Table 4: ANCOVA of the Students' Performance on the Portfolio Assessment and the Post-Test in the Various Writing Sub-skills
Bellaterra Journal of Teaching & Learning Language & Literature. 9.1 (Feb-Mar 2016) ISSN 2013-6196